EXPRESSION CLONING PROCESSES FOR THE DISCOVERY,
CHARACTERIZATION AND ISOLATION OF GENES ENCODING 
POLYPEPTIDES WITH A PREDETERMINED PROPERTY 

I. FIELD OF THE INVENTION

The present invention relates, generally, to the field of expressed gene technology
and expression cloning. More specifically, the present invention relates to the
identification, characterization, and isolation of transcribed nucleic acid sequences encoding
polypeptides having a predetermined property, e.g., cellular localization, structure,
enzymatic function, or affinity to other molecules, and the production of the corresponding
polypeptides.

II. BACKGROUND OF THE INVENTION 

General Background. Proteins are the most prominent biomolecules in living
organisms; in addition to their role as structural components and catalysts, they play a
crucial role in regulatory processes. Both cell proliferation and metabolic
functions are largely controlled and effected by the cooperation of numerous cellular and
extracellular proteins. Lehninger, A. L., 1975, "Biochemistry", Worth Publishers Inc., New
York, New York. For example, signal transduction pathways of many kinds that affect
critical physiological responses operate through proteins by way of their intermolecular
interactions. Metzler, D. E., 1977, "Biochemistry", Academic Press Inc., London.
Furthermore, the transcription of genes and the regulation of such transcription are
dependent upon and controlled by the interdependence of numerous protein factors.
Wainwright, S. D., "Control Mechanisms and Protein Synthesis", Columbia University
Press, New York and London.

Proper functioning of a multicellular organism depends not only on the
interaction of biomolecules within the cell; individual cells must also communicate
appropriately. Such intercellular communication, and the interaction of cells with the
environment, is often realized by the actions of receptors on the extracellular surface and
associated intracellular signal transduction mechanisms. Poste, G., Nicholson, G. L., 1976,
"The Cell Surface", Elsevier, Amsterdam. The information is communicated through the
cell environment to regulate gene expression or protein activities in the cell. Secreted
proteins in the extracellular environment thereby exert potent regulatory effects on certain
cellular functions.

In view of the very simplified paradigm of cell function outlined above, particular
properties of a protein, including cellular localization, structure, affinity to a binding
partner, or enzymatic activity under physiological conditions, appear to be highly indicative
of its type of function. With respect to cellular localization, secreted proteins,
for example, are likely to function as intercellular communicators of signals, while
membrane-associated receptors having an extracellular and an intracellular domain most likely
transmit an extracellular signal into the cell. Cytoplasmic proteins may function as
intracellular signal transmitters and coordinators. Jeter, J. R., Cameron, I. L.,
Padilla, G. M., Zimmerman, A. M., 1978, "Cell Cycle Regulation", London. Nuclear
proteins are likely to be involved in certain aspects of gene regulation. Zawel et al., 1995,
Annu. Rev. Biochem. 64:533-561. Mature proteins found in the Golgi or the ER may have
regulatory roles in the post-translational processing of protein precursors, e.g., cleavage or
addition of carbohydrates. Hirschberg, C., 1987, Annu. Rev. Biochem. 56:63-87.

Membrane-Associated Proteins. For many years, the paradigm of cell function has
motivated numerous drug discovery programs to focus on identifying membrane-associated
proteins, in particular new receptors, and their respective functions. Porter, R. and
O'Connor, M., 1970, "Molecular Properties of Drug Receptors, Ciba Foundation
Symposium", J&A Churchill, London. Many examples in fact compel the conclusion that
improper function of membrane receptors is a significant source of the development of
serious metabolic and proliferative diseases such as cancer. For example, a certain form of
diabetes mellitus, i.e., non-insulin-dependent diabetes mellitus (NIDDM), may be caused by
mutations in the insulin receptor. Ullrich et al., 1985, Nature 313:756-761; Taira et al.,
1989, Science 245:63-66. Furthermore, 30% of all mammary carcinomas are associated
with amplification of the receptor tyrosine kinase HER2. Bargman et al., 1986, Cell
45:649-657; Slamon et al., 1989, Science 244:707-712. In addition to traditional drug
discovery programs targeting receptors, an ambitiously pursued objective has become to
identify membrane-associated receptors as possible gene therapy targets using comparative
genomics, which allows determination of changes in gene expression under, e.g.,
pathological conditions. Wels et al., 1995, Gene 159(1):73-80.

Secreted Proteins. While receptors have mostly been considered as important
potential therapeutic targets, secreted proteins are of particular interest as potential
therapeutic agents. They often have a signalling or hormone function, and hence have a
high and specific biological activity. Schoen, F. J., 1994, "Robbins Pathologic Basis of
Disease", W.B. Saunders Company, Philadelphia. For example, secreted proteins control
physiological reactions such as differentiation and proliferation, blood clotting and
thrombolysis, somatic growth and cell death, and immune response. Schoen, F. J., 1994,
"Robbins Pathologic Basis of Disease", W.B. Saunders Company, Philadelphia.

Significant resources and research efforts have been expended for the discovery and
investigation of new secreted proteins controlling biological functions. Many such
secreted proteins, including cytokines and peptide hormones, are manufactured and used as
therapeutic agents. Zavyalov et al., 1997, APMIS 105(3):161-186. However, of the several
thousand expected secreted proteins, only a few are currently used as therapeutic
compounds. It can be expected that many of the so far undiscovered secreted proteins of
the human organism are effective in correcting physiological disorders and are thus
promising candidates for new drugs.

In the past, novel cytokines and hormone proteins were identified by assaying a
certain cell type for its response to protein fractions or purified proteins. Lauffenburger et
al., 1996, Biotechnology and Bioengineering 52(1):61-80. Other investigators have used
sequence similarities at the DNA level to clone novel interferons and interleukins. Nabori et
al., 1992, Analyt. Biochem. 205(1):42-46. In yet another approach, differential display
techniques were used to compare the expression patterns of stimulated versus unstimulated
cells. Nagata et al., 1980, Nature 287:401-408. All these methods may yield identification
and isolation of certain secreted polypeptides.

Recently, a screening method for the identification of cDNA encoding novel
secreted mammalian proteins in yeast using the invertase gene as a selection marker has
been described. See, United States Patent No. 5,536,637 (the "'637 Patent"). The disclosed
technology relies on the concept that leader sequences of mammalian cDNAs are effective
in exporting the invertase protein deprived of its own leader sequence. This approach yields
partial cDNAs which in turn can be used to screen a full-length cDNA library. The novel
protein of interest can then be manufactured by standard, but laborious, techniques,
including subcloning, transforming a recombinant host, expression, and development and
implementation of a purification process. Furthermore, since the assays described in the
'637 Patent are performed in yeast, the glycosylation pattern of the isolated products will
differ significantly from the natural product produced in mammalian cells. This difference
is a major impediment in view of the fact that an extremely important feature of secreted
proteins (as is true for the extracellular domain of receptors) is their glycosylation pattern
and carbohydrate composition. Rademacher et al., 1988, Annu. Rev. Biochem. 57:785-838.

Nuclear Proteins. Both replication of DNA and transcription of
genes take place in the nucleus. Many nuclear proteins are directly involved in these
processes as transcription factors, as cell cycle regulators, or both. Some nuclear proteins
are responsible for turning on expression of certain metabolic proteins in response to
environmental changes. Zawel et al., 1995, Annu. Rev. Biochem. 64:533-561. Many others
are directly involved in the regulation of cell proliferation. Jeter, J. R., Cameron, I. L.,
Padilla, G. M., Zimmerman, A. M., 1978, "Cell Cycle Regulation", London. Genes encoding
proteins in this latter class fall into two general categories: (1) dominant transforming genes,
including oncogenes; and (2) recessive cell proliferation genes, including tumor suppressor
genes and genes encoding products involved in programmed cell death ("apoptosis").

Oncogenes generally encode proteins that are associated with the promotion of cell 
growth. Because cell division is a crucial part of normal tissue development and continues 
to play an important role in tissue regeneration, properly regulated oncogene activity is 
essential for the survival of the organism. However, inappropriate expression or
improperly controlled activation of oncogenes may drive uncontrolled cell proliferation and
result in the development of severe diseases, such as cancer. Weinberg, 1994, CA Cancer 
J. Clin. 44:160-170. 

Tumor suppressor genes, on the other hand, normally act as "brakes" on cell 
proliferation, thus opposing the activity of oncogenes. Accordingly, inactivation of tumor
suppressor genes, e.g., through mutations or the removal of their growth inhibitory effects,
may result in the loss of growth control, and cell proliferative diseases such as cancer may
develop. Weinberg, 1994, CA Cancer J. Clin. 44:160-170.

Related to tumor suppressor genes are genes whose products are involved in the
control of apoptosis; rather than regulating proliferation of cells, they influence the survival
of cells in the body. In normal cells, surveillance systems are believed to ensure that the
growth regulatory mechanisms are intact; if abnormalities are detected, the surveillance
system switches on a suicide program that culminates in apoptosis.

Several genes that are involved in the process of apoptosis have been described.
See, for example, Collins and Lopez Rivas, 1993, TIBS 18:307-308; Martin et al., 1994,
TIBS 19:26-30. Cells that are resistant to apoptosis have an advantage over normal cells,
and tend to outgrow their normal counterparts and dominate the tissue. As a consequence,
inactivation of genes involved in apoptosis may result in the progression of tumors, and, in
fact, is an important step in tumorigenesis.

Accordingly, the identification of nuclear proteins addresses two areas of interest.
First, oncogenes are likely to be valuable targets for the development of highly specific
drugs for the treatment of cancer. Second, tumor suppressors and apoptosis-inducing
proteins can be useful as agents for the treatment of cancer.

Other Cellular Localizations. Many of the remaining cellular localizations are also
associated with particular functions. For example, most metabolic enzymes are located in
the mitochondria. Lehninger, A. L., 1975, "Biochemistry", Worth Publishers Inc., New
York. Thus, mitochondrial proteins could reveal targets for the treatment of metabolic
diseases. The Golgi apparatus and the ER are associated with post-translational processing
of proteins; such processes are valuable targets for the treatment of diseases related to
protein folding and glycosylation. Rademacher et al., 1988, Annu. Rev. Biochem. 57:785-838.

Enzymatic Activities. A particular enzymatic activity can be indicative of a 
protein's function. For example, kinases are frequently involved in signal transduction 
processes. 

Structures. Protein structure is also indicative of protein function. Indeed, proteins with
similar structures frequently share certain functional properties. Thus, identifying proteins
having a structure related to that of a known protein having a particular function of interest
can reveal additional proteins having such function.

Generally, a method would be desirable which allows one to pre-sort proteins
according to a property of interest, e.g., localization in the cell, affinity to binding partners,
enzymatic activity, structure and the like. Such a method would allow one to generate
libraries of, e.g., secreted proteins, such as cytokines; membrane-associated proteins, such
as receptors; nuclear proteins, such as transcription factors; mitochondrial proteins, such as
respiratory proteins; and so on. Further, it would be desirable to perform screening,
isolation and production of any product in mammalian cells in order to achieve the proper
glycosylation pattern. Finally, the most preferred method would additionally allow one to
identify and isolate proteins of interest per se rather than a partial DNA sequence.

The present invention addresses this need. The present invention provides methods
and expression systems for the generation of expression libraries encoding polypeptides of
a predetermined property, including but not limited to cellular localization, structure,
enzymatic function, or affinity to other molecules. The methods and expression systems of
the present invention allow one to identify and isolate nucleic acids encoding novel proteins
of interest. The methods and expression systems furthermore provide a powerful system
for the identification of thus far unelucidated receptor/ligand relationships. Since the
methods provided can be employed in a wide variety of host cell systems, including
mammalian systems, they provide for expression products having an appropriate
carbohydrate composition.



III. SUMMARY OF THE INVENTION 

The present invention relates, generally, to the field of expressed gene technology
and expression cloning. More specifically, the present invention relates to the
identification, characterization, and isolation of transcribed nucleic acid sequences encoding
polypeptides having a predetermined property, including, but not limited to, cellular
localization, structure, enzymatic function, or affinity to other molecules, and the production
of the corresponding polypeptides.

More specifically, the invention is directed to a method for identifying nucleic acids
encoding proteins with a predetermined property of interest. Such a property may include a
particular cellular localization, structure, enzymatic function, or affinity to other molecules.
In one embodiment of the invention, a plurality of eukaryotic host cells is provided,
wherein each host cell has an expression system comprising a different member, each
member comprising a recombinant nucleic acid encoding an exogenous protein operatively
linked to a control element. In a second step, the eukaryotic host cells are cultured under
conditions where the exogenous protein is expressed while expression of endogenous
proteins of the eukaryotic host cell is suppressed. In this time window, the exogenous
protein may optionally be labelled, or may be treated in a way that allows discrimination
from the untreated exogenous proteins. Finally, the member or members of the expression
system that encode the exogenous protein or proteins having the property of interest are
identified.

In another aspect of the invention, a method for identifying a recombinant nucleic
acid encoding an exogenous protein having a property of interest is achieved by providing a
plurality of eukaryotic host cells, wherein each host cell has an expression system
comprising a different member, and each member comprises a recombinant virus having a
recombinant nucleic acid encoding an exogenous protein operatively linked to a control
element. The eukaryotic host cells are cultured to express the exogenous proteins, and
expression systems expressing recombinant nucleic acid encoding an exogenous protein
having the property of interest are identified. Optionally, the expression systems are
capable of expressing exogenous proteins while endogenous protein production of the
eukaryotic host cell is suppressed. Although any kind of recombinant eukaryotic virus can
be used, particularly advantageous viruses are alphaviruses. In addition, the exogenous
proteins can be preferentially labelled or distinguished from endogenous proteins. The
recombinant virus may be capable of directing the generation of viral particles and
replicating, or the recombinant virus may lack functions required for propagation.

Also encompassed within the invention are methods for generating genetic
expression libraries encoding proteins having a predetermined property of interest. In one
aspect, such methods entail providing a plurality of eukaryotic host cells, wherein each host
cell has an expression system comprising a different member, each member having a
recombinant nucleic acid encoding an exogenous protein operatively linked to a control
element, culturing the eukaryotic host cells under conditions where said exogenous protein
is expressed while expression of endogenous proteins of said eukaryotic host cell is
suppressed, and identifying the members that express recombinant nucleic acids encoding
exogenous proteins having the property of interest. In another aspect, the methods entail
providing a plurality of eukaryotic host cells, wherein each host cell has an expression
system comprising a different member, each member being a recombinant virus having a
recombinant nucleic acid encoding an exogenous protein operatively linked to a control
element, culturing the eukaryotic host cells to express the exogenous proteins, and
identifying the members that express recombinant nucleic acids encoding exogenous
proteins having the property of interest. The invention further includes libraries of proteins
identified using such methods. Additionally, the invention encompasses nucleic acid
libraries having a population of eukaryotic expression systems with a plurality of members,
each member having a recombinant nucleic acid encoding an exogenous protein operatively
linked to a control element for expression in eukaryotic host cells. In one embodiment, the
control element directs the expression of the exogenous proteins while expression of
endogenous proteins in the eukaryotic host cells is suppressed. In yet another
embodiment, the control element is derived from a eukaryotic virus.

IV. BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURE 1 depicts an experiment in which expression systems containing either a
nucleic acid encoding a secreted protein or a nucleic acid encoding an intracellular protein
were identified from a mixture of nucleic acids using compartment screening. FIGURE 1A
is an autoradiogram of labelled proteins precipitated from the supernatants of the screened
cell cultures, as described in more detail in Example 1. Lanes are as follows: lane 1, non-infected
BHK 21 cells; lane 2, cells transfected with an expression system encoding an
intracellular protein, pSinRep5 lacZ; lane 6, cells transfected with an expression system
encoding a secreted protein (EPO), pSinRep5 EPO; lanes 3 to 5 contain mixtures of the two
expression systems in the ratios 90:10, 50:50, 10:90 (SinRep5 lacZ:SinRep5 EPO), showing
increasing amounts of EPO. Protein mass standard is shown on the left side; the molecular
weight of EPO is indicated by the arrow. FIGURE 1B is an autoradiogram of the labelled
proteins in the corresponding cell pellets from lanes 1 and 2 of FIGURE 1A, showing the
accumulation of lacZ protein in the cell pellet, and the shutdown of endogenous protein
production, in cells infected with pSinRep5 lacZ.

FIGURE 2 depicts separation of labelled viral particles from secreted protein, as
described in more detail below in Example 2. Shut-down of endogenous protein synthesis
is apparent in lanes 4 to 6 (infection with TE5'2J CAT) and in lanes 7 to 9 (infection with
TE5'2J EPO) as compared to lanes 1 to 3 (non-infected BHK 21 cells). Removal of viral
particles is demonstrated by the absence of the characteristic pattern of Sindbis structural
proteins (capsid, E1 and E2). Lane 9 shows diffusion of a protein of the size of EPO
through the agarose. In lanes 4 to 9 a soluble protein of viral origin can be seen. It is
assumed that this protein is released by proteolytic cleavage. Labelled protein was
collected at different time points: 2h (lanes 1, 4, 7), 4h (lanes 2, 5, 8) and 8h (lanes 3, 6, 9).
Protein mass standard is shown on the left side; the size of Sindbis viral glycoproteins is
indicated by the upper arrow, and the molecular weight of EPO is indicated by the lower arrow.

FIGURE 3 demonstrates that release of viral soluble proteins is reduced by protease
inhibitors. As described below in Example 3, 10^6 BHK 21 cells in a 35 mm dish were
infected with TE5'2J CAT. Varying concentrations of the protease inhibitor cocktail (100,
20, 10, 5, 1 µl per ml, lanes 1 to 5) were applied in the 4 mm agarose overlay. The
molecular mass standard is shown on the left; the arrow indicates the expected mass of
Sindbis glycoprotein E1.

FIGURE 4 shows identification of an expression system containing a nucleic acid
encoding a protein with a predetermined enzymatic activity in a semi-solid medium
screening. Confluent 35 mm dishes of BHK 21 cells were infected with 200 pfu SinRep
lacZ (Blot 1 on the left) or a mixture of 200 pfu SinRep lacZ / 20 pfu of SinRep5 SEAP
(Blot 2 on the right). Enzymatic activity was detected on the filters with AP staining. Blot
1 (containing only a lacZ expression system) is negative in SEAP activity whereas Blot 2
shows 2 distinct areas with SEAP activity (indicated by arrows).

FIGURE 5 is a schematic representation of the pSIN vectors used to illustrate the invention. The
source and construction of these vectors are described in Table I.

FIGURE 6 is a schematic representation of the pTE vectors used to illustrate the
invention. Vectors shown are pTE5'2J (upper left); pTE5'2J CAT (upper right); pTE5'2J
SEAP (lower left); and pTE5'2J EPO (lower right). The source and construction of these
vectors are described in Table I.

FIGURES 7A and 7B depict an experiment in which approximately 20 pfu of pSinRep5 SEAP
and 780 pfu of pSinRep5 LacZ were mixed in 1 ml Turbodoma HP-1 and a 60 mm dish with
BHK 21 cells was infected for 2 hours. An agarose blot assay was performed as described in
Example 10. FIGURE 7A shows the AP stained nitrocellulose membrane; blotted SEAP
protein is represented by the violet spots. FIGURE 7B shows the developed X-ray film (the AP
stained membrane exposed to the X-ray film), where the black spots represent labeled
secreted protein. Coordinates (indicated by arrows x1, x2, x3 in the AP stained membrane
and the corresponding arrows y1, y2, y3 in the developed X-ray film) of the spots with
SEAP activity can be superimposed with the spots of labeled secreted proteins (compare x1
with y1, x2 with y2, x3 with y3).

FIGURES 8A, 8B and 8C depict an experiment in which pSinRep5 SEAP mRNA and pDHEB
ts mutant mRNAs were co-electroporated into BHK 21 cells. The cell supernatant of the
electroporated cells was analyzed 20 hours post-electroporation by spotting 4 µl supernatant
on a nitrocellulose strip, and AP staining was done as described before. All the
electroporations were positive for SEAP secretion (violet spots on nitrocellulose filter) as
shown in FIGURE 8A. The first passage of the ts mutant viruses was tested for infectivity
at 37°C and at 30°C. The supernatants of the infected cells were tested by AP staining for
the secreted product SEAP; the upper row in FIGURE 8B represents the supernatant of
cells incubated at 37°C, the lower row of FIGURE 8B represents the supernatant of cells
incubated at 30°C. The double mutant pSinRep5 SEAP / pDHEB ts2,20 produced a low
amount of SEAP 20 hours post-infection (see FIGURE 8B), but virus particles were amplified
and a high amount of product was detected 48 hours post-infection, as shown in
FIGURE 8C.

FIGURES 9A, 9B, 9C and 9D depict an experiment in which pSinRep5 hIL13 R alpha-infected
BHK 21 cells (at an moi of approximately 0.1) were analyzed by immunofluorescence. Expressed
hIL13 R alpha was detected with IL13-flag, M2 antibody and antiMouse-FITC. FIGURE
9A: BHK 21 infected at an moi of 0.1 with pSinRep5 hIL13 R alpha, analyzed and sorted
with FACS; FIGURE 9B: the same cells as in FIGURE 9A with immunofluorescence
microscopy; FIGURE 9C: pSinRep5 LacZ infected cells as negative control; FIGURE
9D: 1:100 diluted pSinRep5 hIL13 R alpha virus in pSinRep5 LacZ virus analyzed and
sorted with FACS.

FIGURE 10 depicts an experiment in which approximately 20 pfu of pSinRep5 EPO / DHEB
were mixed with 200 pfu of pSinRep5 LacZ. The virus supernatant was incubated for 2 hours on 90%
confluent BHK 21 cells in a 60 mm dish, before the supernatant was replaced with 3 ml of
0.8% 41°C warm agarose in 1x HP-1 medium. Two days later, a nitrocellulose membrane
was applied on the agarose and diffusion blotting proceeded for 14 hours. Blotted EPO
was detected by immuno-detection with anti-EPO antibody (rabbit) and AP-conjugated
anti-rabbit antibody. The violet spots d1, d2, d3 (and the other dark spots) represent the
blotted EPO derived from the underlying virus plaque representing pSinRep5 EPO virus.

FIGURES 11A-11J depict the polynucleotide sequences of pSinRep5, pSinRep5
EPO, pSinRep5 hIL13Ralpha, and pDH-EB.

FIGURES 12A-12D depict the polynucleotide sequence of pTE5'2J.

FIGURES 13A-13C depict the polynucleotide sequence of 987 BB neo.

FIGURE 14 depicts the polynucleotide sequence of CAT.

FIGURE 15 depicts the polynucleotide sequence of the XbaI/ApaI fragment of
synthetic erythropoietin.

V. DEFINITIONS 

Terms used herein are in general as typically used in the art. The following terms
are intended to have the following general meanings as they are used herein:

The term "cellular localization" refers to a defined localization in the cellular
context, including the extracellular space and any defined intracellular compartment,
including, but not limited to, the nucleus, mitochondria, lysosomes, endoplasmic
reticulum (ER), Golgi, cytoplasm, cell membrane, endocytotic or exocytotic vesicles,
cytoskeleton, and peroxisomes. In microbes or plant cells, further cellular localizations may
include the cell wall and chloroplast.

The term "control element" refers to a nucleic acid component capable of directing
expression of an operatively associated nucleic acid encoding a polypeptide. Generally, a
"control element" comprises regulatory sequences required for transcription and/or
translation of genetic information. The control element further comprises a component
which promotes, under certain conditions or circumstances, selective expression of the
operatively linked nucleic acid, while expression of endogenous host proteins is suppressed.
Optionally, the "control element" may further comprise components facilitating packaging
of the operatively associated nucleic acid.

The term "exogenous protein" refers to a protein which is encoded by a
recombinant nucleic acid. If the "exogenous protein" is expressed in a cell, it is encoded by
a recombinant nucleic acid which has been introduced into the cell or its progenitors. In the
expression systems of the invention, the nucleic acid encoding the exogenous protein is
operatively linked to a control element. In some circumstances, the exogenous protein can
be the same protein as one endogenously produced by the cell.

The term "virus" refers to a bio-entity capable of introducing genetic information 
into a cell. The genetic information may be introduced in the form of RNA, DNA, or 
derivatives thereof. The cell may be eukaryotic or prokaryotic. 

VI. DETAILED DESCRIPTION OF THE INVENTION
A. General Overview Of The Invention 

The present invention is directed to methods and expression systems
allowing rapid identification, characterization and isolation of transcribed nucleic acid
sequences encoding polypeptides having a predetermined property, including, but not
limited to, cellular localization, structure, enzymatic function, or affinity to other
molecules. The invention is based, in part, on the inventors' surprising discovery of
materials and compositions that allow the discrimination between a unique exogenous
protein of interest and all other "common" proteins present in the system, including
endogenous cellular proteins and recombinant proteins which are a permanent part of the
expression system. Generally, the methods of the invention employ eukaryotic host cell
systems, into which the expression systems are introduced for any further characterization
according to the invention. Preferably, the employed expression systems have an inherent
feature which allows introduction of only one expression system per eukaryotic cell. Such
systems have the advantage that they allow rapid one-step cloning of individual expressed
sequences.

The system which is described in more detail below offers a number of advantages.
First, the system of this invention is an expression cloning system which allows the direct
cloning of full-length cDNAs in one step. Second, it facilitates rapid expression of the
protein of interest without further laborious handling such as subcloning and establishment
of a production cell line. Third, as the products may be expressed in mammalian cells, the
system may yield correctly folded, glycosylated and active material. Fourth, the system
allows rapid purification of the material without explicit knowledge of the nature of the
protein. Fifth, it allows labelling of the material, facilitating rapid identification of sites of
binding in any context including animals. Finally, the system directly provides vectors
useful for gene transfer into cultured cells, tissues, or animals.
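
By way of illustration only, the preference noted above for introducing only one
expression system per eukaryotic cell can be examined numerically for the case of viral
delivery. The following minimal sketch assumes, as a simplification not stated in this
specification, that the number of expression systems entering a cell follows a Poisson
distribution whose mean is the multiplicity of infection (moi). The function names are
hypothetical. The moi of 0.1 is the value given for the experiment of FIGURES 9A-9D.

```python
import math


def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell receives exactly k expression systems.

    Assumption (not from the specification): entry events are independent,
    so the count per cell is Poisson distributed with mean equal to the moi.
    """
    return math.exp(-moi) * moi ** k / math.factorial(k)


def single_hit_fraction(moi: float) -> float:
    """Among cells that received at least one expression system,
    return the fraction that received exactly one."""
    p_none = poisson_pmf(0, moi)
    p_one = poisson_pmf(1, moi)
    return p_one / (1.0 - p_none)


if __name__ == "__main__":
    for moi in (0.01, 0.1, 1.0):
        fraction = single_hit_fraction(moi)
        print(f"moi={moi}: {fraction:.3f} of infected cells carry one system")
```

Under this assumption, roughly 95% of the infected cells at an moi of 0.1 carry a single
expression system, whereas at an moi of 1 the fraction falls below 60%.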

In one aspect, the invention discloses methods which allow the identification and
isolation of novel and known proteins having a predetermined property, for example
cellular localization. More specifically, within a plurality of expression systems encoding
unique exogenous proteins, which are expressed in suitable host cell systems, the disclosed
methods allow one to identify and isolate those expression systems which encode
exogenous proteins located in a cellular compartment of interest. The method is based, in
one of its aspects, on a unique feature of the employed expression system which allows,
during a certain time window, high expression of the exogenous proteins while the
synthesis of endogenous proteins is suppressed. If during this time window the expressed
exogenous proteins are treated in a certain detectable way, for example by radioactive
labelling, they may be identified easily within a chosen cellular fraction. Alternatively, all
proteins expressed prior to the time window in which synthesis of endogenous protein is
suppressed may be treated in an identifiable way; in this scenario the exogenous proteins
which are expressed after such treatment, while expression of endogenous proteins is
inhibited, may be identified by not exhibiting the feature caused by the treatment.

In one aspect, the methods of the invention may be used to identify and isolate 
individual expression systems encoding secreted proteins. During a time window in which 
expression of the endogenous proteins of the host cell system is suppressed, the exogenous 
proteins are labelled, for example by incubation with radioactive amino acids.
Subsequently, all secreted proteins are separated from the host cells comprising the
individual expression systems. This separation is performed in a manner which allows one
to correlate the secreted protein fractions with the host cell/expression system from which
they are derived. For example, the host cells comprising the individual expression systems may be
grown in semisolid medium, such as soft agar, at a density which allows physical separation
of individual clones or colonies. If the secreted proteins are separated from the cells by a
vertical diffusion process on, for example, a nitrocellulose filter, which may be achieved by
placing the filter on top of the semisolid medium, the secreted proteins are bound on the
filter in a pattern which exactly mirrors the physical distribution pattern of the
corresponding host cells in the semisolid medium. Those host cells/expression systems
which comprise exogenous secreted proteins can thus easily be identified by the presence of the
label in the fraction of secreted proteins.
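
The correspondence between labelled filter signals and colonies in the semisolid medium can be sketched in code. The following Python sketch is illustrative only and is not part of the disclosed method: the coordinate representation, the function name `match_spots_to_colonies`, the `tolerance` parameter, and the optional left-right reflection (for a filter read from the side opposite to the one that faced the medium) are all assumptions made for the example.

```python
from math import hypot


def match_spots_to_colonies(colonies, spots, tolerance=1.0, flip_x=False, width=0.0):
    """Assign each labelled filter spot to the nearest colony.

    colonies: dict mapping a clone identifier to its (x, y) position in the medium.
    spots: list of (x, y) positions of labelled signals detected on the filter.
    tolerance: hypothetical maximum distance for a spot to count as a match.
    flip_x: hypothetical option; if True, spot x-coordinates are reflected
            across the plate width before matching.
    Returns a dict mapping clone identifier -> list of matched spot positions.
    """
    hits = {}
    for sx, sy in spots:
        if flip_x:
            sx = width - sx
        best_id, best_dist = None, None
        for clone_id, (cx, cy) in colonies.items():
            d = hypot(sx - cx, sy - cy)
            if best_dist is None or d < best_dist:
                best_id, best_dist = clone_id, d
        # Spots farther than the tolerance from every colony are left unassigned.
        if best_id is not None and best_dist <= tolerance:
            hits.setdefault(best_id, []).append((sx, sy))
    return hits
```

In such a scheme, only clones appearing in the returned mapping would be picked for further analysis; a suitable tolerance would depend on colony spacing and on how far the proteins diffuse, neither of which is specified here.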

Likewise, the methods may be applied to identify and isolate membrane associated 
proteins, nuclear proteins, mitochondrial proteins, lysosomal proteins, Golgi apparatus 
proteins, ER proteins, etc., as described in more detail within the body of this application. 
In short, the above principle may, with modifications, be applied to identify and isolate
exogenous proteins localized in any other predetermined cellular localization of interest.
Furthermore, the methods and expression systems of the invention may be applied to 
identify and isolate nucleic acids encoding exogenous proteins having a particular 
enzymatic activity, binding affinity to a binding partner, or structure of interest. 

In a second general aspect, the invention provides methods which allow expression
cloning of nucleic acids encoding exogenous proteins for which a predetermined binding
molecule or substrate (ligand) is available. For example, to identify an unknown secreted
ligand for a known receptor, receptor protein may be immobilized on a filter, e.g., a
nitrocellulose filter, in such a manner that the filter is saturated with protein, i.e., does not bind
any further protein molecules. A plurality of expression systems is provided that has been
prepared using expressed nucleic acids from a source known to express the receptor's
ligand, which can be determined using standard methods. The expression systems are again
introduced and expressed in suitable host cells in such a manner that host cells comprising
individual expression systems are physically separable in an identifiable way, e.g., by
growing the cells in semisolid medium. As described for the identification of
expression systems comprising secreted proteins, the exogenous proteins are labelled within
the time window in which endogenous protein expression is suppressed. Subsequently, the
secreted proteins are transferred onto the nitrocellulose filter which is saturated with the
receptor protein. Since all binding positions on the filter are saturated with protein, see,
supra, only a ligand binding to the receptor may adhere to the filter. Exogenous protein
corresponding to the receptor's ligand may be identified by its label, and correlated to
the expression system from which it is derived. Accordingly, if a receptor for a predetermined
ligand is to be cloned, labelled ligand is applied to host cells comprising a plurality of
expression systems, and binding of said labelled ligand to the host cells is detected.
Individual expression systems giving rise to ligand binding may be isolated.

One advantage of the expression systems provided by the present
invention is that they may be introduced and expressed in a wide variety of host cells. The
expression cloning of receptors and ligands can therefore be performed in a host cell system
of the same species as, or a species related to, that from which the receptor and ligand are
derived. This is a valuable feature, in particular in view of the significance of the
glycosylation pattern, which is known to be species dependent and might influence
receptor-ligand binding.



B. The Process For Identification And Isolation Of Proteins Having A 
Predetermined Property 

1. The Expression System 

One important component for the methods provided by the present 
invention is an expression system having a number of features. To practice the methods of 
the invention, one or a plurality of unique expression systems is generated and introduced
into eukaryotic host cells, leading to a population of host cells, with generally each cell
containing one unique type of expression system. It will be understood, however, that the
population of host cells can contain some cells having the same expression system.

Generally, the expression system provided by the present invention is a nucleic acid 
molecule comprising two critical elements. First, it comprises a control element; secondly, 
it comprises a heterologous nucleic acid encoding a recombinant protein, referred to as 
"exogenous protein", which is operatively linked to the control element. 

The Control Element. In its most basic version, the control element comprises 
components which facilitate transcription and translation of the exogenous protein.
Preferably, the control element also is or contains a component which promotes, under
certain circumstances, selective expression of the exogenous protein while expression of
endogenous proteins in a host is suppressed. The control element may further comprise
additional components, as detailed below.

A control element useful for the practice of the invention may be derived from a
number of sources. Most typically, the control element is derived from a virus. A variety
of viral control elements have been described that are capable of directing expression in
eukaryotic cells. Particularly preferred control elements are further capable of promoting,
under certain circumstances, expression of nucleic acids in eukaryotic cells while
expression of endogenous proteins is inhibited. Such viral control elements, both already
described and yet to be discovered, are within the scope of the invention.

Viruses of the group of RNA viruses are preferred vectors for the present invention. 
Most preferably, viruses of the group of alpha viruses are used as viral vectors, because they
themselves have the ability to block host protein synthesis. Examples of suitable alpha
viruses are Sindbis virus, Semliki forest virus or Venezuelan equine encephalitis virus. A
variety of alpha virus derived control elements has been described. See, among other
places, Liljestrom and Garoff, 1991, Biotechnology 9:356-361; Hahn et al., 1992, Proc.
Natl. Acad. Sci. USA 89:2679-2683; Bredenbeek, 1993, J. Virol. 67:6439-6446.

Alpha viruses are positive-strand RNA viruses of the family of Togaviridae. Strauss
et al., 1994, Microbiological Reviews pp. 491-562. Alpha viruses can function in a broad
range of host cells, including mammalian, avian, amphibian, reptilian and insect cells. The
genome of alpha viruses comprises elements capable of directing the expression of proteins
encoded by nucleic acids of said viral genome in large amounts.

The expression of proteins encoded by the alpha viral genome is independent of
expression of proteins encoded by the genome of the host cell. Transcription of cellular
genes may be arrested while the expression of virally encoded nucleic acids is not
noticeably affected. Moreover, the expression of the viral genome of alpha viruses in a cell
was found to result in the inhibition of translation of cellular messenger RNA molecules.
Frolov et al., 1994, J. Virol. 68:1721-1727. This feature of alpha virus mediated

expression of proteins in cells is particularly useful for the practice of the invention.

A number of vectors of the alpha virus group, e.g., the double subgenomic vectors, are
provided, in which a recombinant nucleic acid may be inserted at a site in the genome
downstream of a second subgenomic promoter or another genetic element leading to
expression of the exogenous nucleic acid. Hahn et al., PNAS, 89:2679-2683. It is well
known that, for example, internal ribosome entry sites are functional in a Sindbis virus
background. An RNA molecule in the cytoplasm containing a viral internal ribosome entry
site upstream of a resistance gene, e.g., a neomycin resistance gene, is sufficient for the
cell to survive selection with G418. Therefore, the viral internal ribosome entry site is
sufficient for translation. Also, many sites within the genome are functional for expression
of the exogenous nucleic acid. Sindbis virus vectors have been engineered which contain
the second subgenomic promoter/expression cassette upstream or downstream of the
structural genes. Frolov et al., 1996, PNAS, 93:11371-11377. Additionally, for increased
cloning capacity and increased safety of the vectors, the structural proteins can be supplied
by helper functions provided either by a packaging cell line or a helper virus particle.
Bredenbeek et al., 1993, J. Virol., pp. 6433-6446. Removing the structural proteins from
the replicon increases the cloning capacity of the vector used, which is advantageous for
cloning of large secreted proteins and glycoproteins. See, infra. As these examples show,
many modifications of the genetic arrangements are possible within the scope of
the invention.

Although alpha viral control elements are most preferred, a number of other viral
control elements may be employed which fulfill the criterion that synthesis of proteins
under viral regulation can occur when protein synthesis of the host cell is arrested.
According to the present invention, a variety of methods can be applied to uncouple viral
protein synthesis from host cell protein synthesis. For example, all eukaryotic genes which
encode proteins are transcribed by RNA polymerase II. Thus, if a control element is used
which does not use RNA polymerase II for transcription, the host protein synthesis may
selectively be inhibited using an RNA polymerase II inhibitor; inhibition of transcription
consequently also leads to inhibition of translation.

A variety of RNA polymerase II specific inhibitors have been described and
are useful for the practice of this embodiment of the invention. The skilled artisan would
readily know which RNA polymerase II specific inhibitor to use. For example, antibiotics
have been demonstrated to be useful for such inhibition. Such antibiotics are, for example,
Actinomycin D, Aflatoxin B1, and Amatoxin, which are useful for the practice of this
embodiment of the invention.

In another embodiment of the invention, the control element may be derived from a
prokaryotic cell. In that case, the expression system would also include the gene encoding
bacterial RNA polymerase. Again, host transcription could then be selectively inhibited
using an RNA polymerase II inhibitor, such as Actinomycin D, Aflatoxin B1, or Amatoxin.
The skilled artisan would readily know which control element to use in the practice of the
invention.

In still another embodiment of the invention, control elements derived from
eukaryotic systems may be used in the expression systems of the invention. For example,
one may use a promoter or enhancer element specific for RNA polymerase I or III as
control element. This would render the expression of the nucleic acid encoding the
exogenous protein dependent upon RNA polymerase I or III, but not on RNA polymerase
II. In eukaryotic host cells, genes encoding proteins have been demonstrated to be
transcribed by RNA polymerase II, and it is believed in the art that RNA polymerase II is
solely responsible for such transcription. Therefore, in the practice of this embodiment,
one may, for example, inhibit RNA polymerase II dependent transcription by adding an
RNA polymerase II specific inhibitor, thus inhibiting the expression of endogenous proteins
without such an effect on the expression of the exogenous protein.
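
The logic shared by the embodiments above, namely that exogenous expression persists under an RNA polymerase II inhibitor only if the control element does not rely on RNA polymerase II, can be summarized in a small lookup. This Python sketch is purely illustrative: the dictionary keys, the polymerase labels, and the function name are hypothetical, and the assignment of a viral RNA-dependent RNA polymerase to alpha virus replicons is stated here as background knowledge rather than taken from the text above.

```python
# Polymerase assumed to drive transcription from each class of control element.
# Keys and labels are illustrative and not terminology from the specification.
CONTROL_ELEMENT_POLYMERASE = {
    "alpha_virus_replicon": "viral RNA-dependent RNA polymerase",
    "prokaryotic_promoter": "bacterial RNA polymerase",
    "pol_I_promoter": "RNA polymerase I",
    "pol_III_promoter": "RNA polymerase III",
    "pol_II_promoter": "RNA polymerase II",
}

# Polymerase responsible for transcription of host protein-coding genes.
HOST_POLYMERASE = "RNA polymerase II"


def selectively_expressed(control_element):
    """Return True if expression from the control element is expected to
    continue while host RNA polymerase II transcription is inhibited."""
    return CONTROL_ELEMENT_POLYMERASE[control_element] != HOST_POLYMERASE
```

Under these assumptions, every class listed except a conventional RNA polymerase II promoter would qualify for selective labelling in the presence of such an inhibitor.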

The Nucleic Acid Encoding An Exogenous Protein. As set forth above, the
expression system of the invention comprises, in addition to a control element, a
recombinant nucleic acid encoding an exogenous protein which is operatively linked to the
control element. The recombinant nucleic acid may be derived from any source, i.e., any
organism, tissue or cell type, disease state, etc. In one embodiment of the invention, a
plurality of different nucleic acids is inserted into a plurality of expression systems, so that
a plurality of expression systems is generated, each encoding a unique exogenous protein.
Alternatively, one known or unknown nucleic acid of interest encoding one particular
exogenous protein to be characterized may be inserted into the expression system.

In one embodiment of the invention, the nucleic acid component encoding an
exogenous protein may be derived from a nucleic acid library. This embodiment is
particularly preferred if the objective is to identify and isolate nucleic acids encoding
exogenous proteins with a predetermined property of interest. The library may be obtained
from a tissue or cell type of interest. This library may be a cDNA library, a genomic
library, an RNA library, a heterologous RNA library, or any other kind of library
comprising transcribed nucleic acid from any kind of organism, tissue, or cell type known
to the skilled artisan. In preferred embodiments, the library is derived from a mammalian
source, in most preferred embodiments from a human source; however, it may also be
derived from reptilian, amphibian, avian, insect, plant, fungal, bacterial cells, etc. In some
instances, the recombinant nucleic acid will be derived from a subtractive library, for
example a library which comprises cDNAs differentially expressed in a disease state when
compared to the corresponding healthy tissue. A nucleic acid library typically comprises a
number of different nucleic acid species, each species having a distinct nucleic acid
sequence when compared to other species in the library. The number of nucleic acid
species, or complexity, of a library may vary widely, depending on a number of parameters.
For instance, in the case of a cDNA library, the complexity of the library depends on the
complexity of the RNA pool used to generate the cDNA library. Suitable nucleic acid
libraries may be generated using standard methods, see, among other places, Sambrook et
al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor (1989).
Alternatively, a suitable library may be obtained from numerous commercial sources,
including, but not limited to, Clontech, Palo Alto, California; and Stratagene, La Jolla,
California.
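
The practical consequence of library complexity for screening effort can be illustrated with the standard sampling estimate N = ln(1 - P) / ln(1 - f), where f is the fractional abundance of the nucleic acid species of interest and P is the desired probability that it is represented at least once among N clones. This estimate is not taken from the present disclosure; it assumes independent, unbiased sampling of clones, and the function name and example values below are hypothetical.

```python
import math


def clones_required(probability, abundance):
    """Estimate how many independent clones must be screened so that a species
    present at fractional `abundance` is included at least once with the
    given `probability` (assumes unbiased, independent sampling)."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    if not 0.0 < abundance < 1.0:
        raise ValueError("abundance must lie strictly between 0 and 1")
    # log1p(-x) computes ln(1 - x) accurately for small x.
    return math.ceil(math.log1p(-probability) / math.log1p(-abundance))


# Example: a species making up 1 in 100,000 of the library, 99% confidence.
n_clones = clones_required(0.99, 1e-5)
```

The estimate grows roughly in inverse proportion to the abundance, which is one reason a subtractive or otherwise enriched library may reduce the number of clones to be screened.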

Optionally, the genetic library may be propagated in E. coli, phages, yeast or the like.

Alternatively, the nucleic acid encoding the exogenous protein may be derived from
mRNA isolated from a tissue or cell type of interest. In this case, the mRNA is reverse
transcribed into cDNA. Starting with cellular RNA or, preferably, mRNA, many methods
are known for the preparation of cDNA. Most preferably, directed cDNA is produced
which can be inserted into the viral vector such that the clones carry the inserts in sense
orientation. Stratagene, e.g., offers several kits for the generation of directed cDNA
(lambda ZAP cDNA kit). Further, standard protocols for the generation of cDNA are
available, and can be found, among other places, in Sambrook et al., Molecular Cloning: A
Laboratory Manual, 2nd Ed., Cold Spring Harbor (1989).

In another embodiment of the invention, the nucleic acid encoding an exogenous
protein is a particular and specific nucleic acid of interest. The expression system so
generated is particularly useful for characterization of a particular nucleic acid or protein of
interest, as it conveniently allows one to determine where a protein encoded by a nucleic acid
molecule is located in the cell, thereby revealing important information to the skilled artisan
as to the function of that protein.

The nucleic acid of interest may be DNA or RNA. Whether the nucleic acid is
available as DNA or RNA, it can easily be used with a control element as part of the
expression system of the invention. Once the artisan has chosen a control element useful
for the method of the invention, it will require only routine experimentation to use the
nucleic acid of interest with that control element.

For example, if the nucleic acid of interest is available in the form of RNA and the
control element of choice is available in the form of DNA, the nucleic acid of interest can
be converted into DNA using routine reverse transcriptase assays. Alternatively, the control
element could be converted into RNA by using routine transcription assays. Following
either conversion, the resulting two DNA or RNA molecules can be ligated together using
routine DNA ligase assays or RNA ligase assays. Finally, the resulting expression system,
comprising the nucleic acid of interest and the control element combined, can then be used
in the invention. Similarly, if the nucleic acid of interest is available in
the form of DNA and the control element in the form of RNA, either routine conversion
method can be applied, followed by the appropriate routine ligase method. The skilled
artisan would readily know which procedure for transcription, reverse transcription, or
ligation to follow in order to practice the invention.
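
The choice between conversion routes described above can be written out as a short decision procedure. The sketch below is only an illustration of that reasoning: the function name `assembly_steps`, the `convert` switch, and the wording of the returned step descriptions are hypothetical and do not correspond to any particular laboratory protocol.

```python
def assembly_steps(insert_form, control_form, convert="insert"):
    """List the routine steps needed to join a nucleic acid of interest to a
    control element.

    insert_form, control_form: "DNA" or "RNA".
    convert: hypothetical switch choosing which molecule is converted when the
             two forms differ ("insert" or "control").
    """
    valid = {"DNA", "RNA"}
    if insert_form not in valid or control_form not in valid:
        raise ValueError("forms must be 'DNA' or 'RNA'")
    if convert not in {"insert", "control"}:
        raise ValueError("convert must be 'insert' or 'control'")

    steps = []
    if insert_form == control_form:
        # Both molecules already share a form; only ligation is needed.
        target = insert_form
    else:
        if convert == "insert":
            target, molecule = control_form, "nucleic acid of interest"
        else:
            target, molecule = insert_form, "control element"
        # RNA -> DNA needs reverse transcription; DNA -> RNA needs transcription.
        conversion = "reverse transcription" if target == "DNA" else "transcription"
        steps.append(f"{conversion} of the {molecule}")
    steps.append(f"{target} ligation")
    return steps
```

For an RNA insert and a DNA control element, the default route yields reverse transcription of the insert followed by DNA ligation, while `convert="control"` yields transcription of the control element followed by RNA ligation.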

Alternatively, the nucleic acid of interest may be DNA of any kind. This includes,
but is not limited to, complementary DNA ("cDNA"), genomic DNA, cDNA that was
manipulated to include additional nucleic acids, or any other kind of DNA. Any kind of
DNA can be used in the method of the invention and can be linked to the
control element molecule by using routine ligation assays. The expression of genomic
DNA is well within the scope of the invention, as the use of eukaryotic cell types facilitates
the necessary processing of heterogeneous RNA molecules that are created through
transcription of genomic DNA, to yield messenger RNA useful for cellular protein
synthesis.

The nucleic acid encoding the exogenous protein, whether derived from a cDNA library,
from RNA, or from any other source, may be integrated into the expression system using
standard procedures. Such integration may be facilitated by, for example, DNA ligation or
RNA ligation procedures to connect a nucleic acid molecule of the library with a control
element molecule.

Additionally, different methods of assembly of the cDNA or RNA and the viral
vector are available. Standard ligation of the nucleic acids as DNA is the preferred way to
carry out the assembly step. Other methods are of course feasible, e.g., RNA molecules
can be ligated using the enzyme T4 RNA ligase. Numerous standard methods are available
and described, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual,
2nd Ed., Cold Spring Harbor (1989).

Viral Propagation. When using a virally derived control element in the practice of
the invention, one may want to propagate the virus in order to have the expression system
available for future experimentation. The propagation of a virus requires the availability of
certain molecular components to facilitate the assembly of viral particles. Some molecular
components, in the case of proteins, typically have to be encoded by nucleic acids of the
viral genome, depending on the particular virus. However, one cannot add unlimited
numbers of nucleic acids to the genome of a virus while maintaining the ability of the
resulting derivative of the viral genome to become packaged into viral particles.

Therefore, when trying to maximize the size of a nucleic acid of interest that can be
incorporated into a viral genome for characterization in the method of the invention, it is
necessary to delete nucleic acids found in the wild type viral genome from that genome.
However, such deletion may render the resulting expression system incapable of
propagating the virus from which the control element is derived.

This problem may, however, be overcome by using a helper function that provides
the necessary components for viral propagation. In one embodiment of the invention,
such a helper function may be provided by use of a helper nucleic acid. A helper nucleic
acid may be, for example, a defective helper RNA, as used for alpha virus based expression
systems, to facilitate the propagation of the alpha virus.

In another embodiment of the invention, such a helper function may be provided
through a packaging cell line. A packaging cell line encodes in its genome the components
necessary for viral propagation that are usually encoded by the viral genome, but were deleted for
purposes of cloning utility. These components will therefore be expressed in the cell line
used in the method of the invention and propagation of the virus will be facilitated.

The generation of such a packaging cell line is well within the knowledge of the
skilled artisan using routine experimentation. For example, depending on the virus
chosen for the practice of the invention, one may take the exact same nucleotide sequence
deleted from the viral genome, link it to a control element known to facilitate high level
expression of nucleic acids in the chosen cell line, and transfect the resulting nucleic acid
into the cell line. An example of such a vector is provided below. Specifically, the vector
987 BBneo (SEQ ID NO:2) can be transfected into BHK 21 cells with the Ca-phosphate
method and subsequently selected with 200 mg/l G418 according to known methods. This
can be further facilitated by, for example, using standard screening procedures for the
establishment of cell lines with new properties, such as the screen for the
presence of a marker providing resistance to a certain toxin. Such a marker, for
example the neomycin resistance gene, which renders cells resistant to neomycin selection,
may be included in the expression system comprising the nucleic acids encoding
components necessary for virus propagation and the control element specific for the chosen
cell line.



2. Introduction Of The Expression System In Host Cells

The expression system, or typically a plurality of unique expression
systems comprising library inserts encoding exogenous proteins, is introduced into
suitable eukaryotic host cells. In this manner, a plurality of eukaryotic host cells can be
provided, wherein each host cell has an expression system comprising a different library member.
Of course, in this population there may be multiple host cells that contain the same
member. The host cells can be derived from any kind of organism, depending on the
compatibility of the expression system. Preferably, the host cell is identical with or related
to the cellular source of the recombinant nucleic acids encoding the exogenous protein,
which, in preferred embodiments, is mammalian.

Introduction of the expression systems into eukaryotic host cells can be carried out
using a number of different well known procedures. Transfection with CaPO4 or
polyethylenimine, lipofection, and electroporation are only some of the available techniques to
introduce nucleic acids into animal cells. An RNA molecule in the cytoplasm containing a
viral internal ribosome entry site upstream of a resistance gene, e.g., a neomycin resistance
gene, is sufficient for the cell to survive selection with G418. Therefore, the viral internal
ribosome entry site is sufficient for translation. An advantage of the alpha viruses is their
broad host range; infection of mammalian, insect, avian, amphibian, and reptilian cells
from various different tissues and of different degrees of transformation has been reported.
Cells containing exogenous viral nucleic acids can be used to produce viral particles which
can be collected from the supernatant or by cell lysis. Alternatively, the cells containing the
exogenous nucleic acids can be used to induce plaque formation in a feeder culture.

3. The Screening Process For The Identification And Isolation Of
Exogenous Proteins Having A Predetermined Property

In order to facilitate screening for individual exogenous proteins, a
cell culture is provided which allows physical separation of different viral clones and
physical separation of viral particles and/or viral proteins from exogenous novel secreted
proteins or glycoproteins of interest.

According to the present invention, the detection of clones secreting novel
radioactively labelled proteins or glycoproteins can be performed in various ways. Physical
separation of the viral clones can be achieved, for example, by plaque formation in semisolid
medium consisting of normal growth medium containing agarose or
carboxymethylcellulose (similar to phage screening). Alternatively, single virus particles
(isolated by limiting dilution) or plaque purified clones can be inoculated into wells of, e.g.,
a 96 well plate containing animal cells. Addition of single infected cells to a feeder layer in 96 well
plates will infect the cells with a single clone. Other methods of physical separation of
viral clones are possible without leaving the scope of the invention.
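
Where single virus particles are distributed into wells by dilution, the fraction of wells expected to receive exactly one particle can be estimated under a Poisson assumption. The following sketch is not part of the disclosure; it assumes that particles are distributed independently among wells, and the function names and the `mean_per_well` parameter are hypothetical.

```python
import math


def well_fractions(mean_per_well):
    """Expected fractions of wells receiving zero, exactly one, or more than
    one particle when particles are Poisson distributed among wells."""
    if mean_per_well <= 0:
        raise ValueError("mean_per_well must be positive")
    empty = math.exp(-mean_per_well)
    single = mean_per_well * empty
    multiple = 1.0 - empty - single
    return {"empty": empty, "single": single, "multiple": multiple}


def clonal_fraction_of_positive_wells(mean_per_well):
    """Fraction of non-empty wells expected to contain exactly one particle."""
    f = well_fractions(mean_per_well)
    return f["single"] / (f["single"] + f["multiple"])
```

Under this model, lowering the mean number of particles per well increases the share of positive wells that are clonal, at the cost of more empty wells.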

Blocking Of Residual Host Protein Synthesis. Where selective labelling of virally
encoded exogenous proteins is desired, host cellular protein synthesis has to be silent.
According to the present invention, several methods can be used to suppress cellular protein
synthesis. Using viruses of the alpha virus group addresses this problem, since the virus
shuts down cellular protein synthesis entirely, see, supra. Alternatively, depending on the
control element employed, see, supra, if other viruses are used, host cell directed RNA
polymerase II transcription and translation can be suppressed specifically by addition of
agents such as Actinomycin D, Aflatoxin B1, or Amatoxin. Also, in cases where not all cells
are subject to viral replication, the remaining fraction of cells can be
transcriptionally/translationally arrested with such an agent. Condreay et al., 1988, J.
Virol., 62:2629-2635.

The Labelling Step. In a preferred embodiment of the invention, the exogenous
protein expressed by the expression system is rendered distinguishable from the
endogenous proteins of the host cell system by specific labelling during a time window in
which endogenous protein expression is suppressed while the exogenous protein is
expressed. For example, a pulse of radioactively labelled amino acids of suitable length
and intensity is applied such that sufficient radioactive protein can be recovered from the
clone for detection. The system may be varied in at least three dimensions, i.e., (1) pulse
length, (2) pulse intensity, and (3) size of plaques.

At a suitable time after infection, when sufficient cells undergo viral replication and
express virally encoded proteins and when suppression of RNA polymerase II transcription
and/or translation is complete, a pulse of radioactive amino acids is added to the culture.
Of course, many different combinations of pulse and chase times, plaque sizes,
choice of isotopes, mode of autoradiography (X-Ray film vs. Phosphoimager), etc. are
suitable. Examples of suitable combinations of labelling times, isotopes, concentrations,
etc. are given in the examples. The application of radioactive amino acids can be carried
out as a pulse with a subsequent chase period. Other schemes are suitable, e.g., a
continuous incorporation of radioactive amino acids. Monomeric amino acids are not
precipitable under certain conditions at which proteins are precipitable and can therefore be
subtracted. Suitable isotopes are 35S, 3H, and 14C, in any of the twenty (20) natural amino acids.
Alternatively, endogenous proteins can be treated in a way that allows them to be discriminated from
newly synthesized exogenous proteins. Treatments of the present invention are: 1)
removal of sialic acid residues from the cell surface by sialidase, whereby new membrane proteins
being synthesized from the exogenous expression system display sialic acid residues on the
cell surface, allowing detection; 2) proteolytic digestion of endogenous proteins (e.g., on the
cell surface). Other methods are possible and within the scope of the present invention.

Soluble Viral Proteins. When using an expression system with a control element
derived from a virus comprising viral proteins, one may encounter the problem of soluble
viral proteins that are expressed when the exogenous protein is expressed. This problem
may be of varying significance. For example, when using the semisolid medium, e.g.,
soft-agar-plate, based screening method, such soluble viral proteins could bind the membrane
designed to bind the exogenous protein.

In one embodiment of the invention, such soluble viral proteins are kept from
interfering with the screen for desired exogenous proteins by using a filter that can bind or
stop the flow of such soluble viral proteins. In another embodiment, antibodies specific to
the soluble viral proteins are attached to such filter papers so as to bind the soluble viral
proteins. A variety of filter papers useful to bind or stop the flow of soluble viral proteins
has been described. The skilled artisan would readily know which filter to use for the
practice of the invention. It is also well within the knowledge of the skilled artisan to
attach antibodies specific for soluble viral proteins to such filters. In another embodiment
of the invention, antibodies are crosslinked to the semisolid medium (e.g., NuFix Agarose,
FMC Bioproducts, Rockland, USA), which prevents the viral proteins from interfering with the
screen.

In another embodiment of the invention, the generation of soluble viral protein is
reduced by using serum containing medium. Any amount of reduction is desirable,
preferably to levels that allow detection of secreted proteins of interest in the background of
viral proteins. Yet another way of reducing the generation of soluble viral proteins is by
using a protease inhibitor or inhibitors. An example of using protease inhibitors to reduce
the amounts of soluble viral proteins is illustrated below by way of a working example.
Any one or number of protease inhibitors can be used and are well known to those of

30 ordinary skill in the art. 
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ln another embodiment of the invention, the generation of soluble viral protein is 
reduced by using an expression system containing a control element that is based on a 
mutant virus. A variety of mutant virus strains has been described which are useful for this 
embodiment of the invention. 
The skilled artisan would readily know which mutant viral strain to use for this
embodiment. For example, mutant viral strains of the genus Alphavirus have been
described that are useful for the practice of this embodiment of the invention. Mutant
Sindbis virus strains useful for the practice of this embodiment of the invention include,
for example, those having the genotypes ts20, ts10, and ts23. Lindquist et al., 1986,
Virology, 151:10-20; Arias et al., 1983, J. Mol. Biol., 168:87-102; Carleton and Brown, 1996,
J. Virol., 70:952-959. Such temperature-sensitive virus mutants are defective in cleavage of
PE2 (ts20) and defective in folding of E1 (ts10; ts23). As a consequence, the viral proteins
do not reach the cell surface at the nonpermissive temperature. Leakage of soluble proteins
is reduced by use of these mutants.

In another embodiment of the invention, the generation of soluble viral proteins is
reduced by using a cell line deficient in the proteolysis of viral proteins in the method of the
invention. A variety of cell lines deficient in the proteolysis of viral proteins have been
described. Watson et al., 1991, J. Virol. 65:2332-2339.

The skilled artisan would readily know which cell line deficient in proteolysis of
viral proteins would be useful for the practice of the invention. A cell line useful for the
practice of this embodiment of the invention would be deficient in the proteolytic cleavage
of proteins of the virus used in the method of the invention. For example, when using an
alphavirus in the practice of the invention, cell lines described that are deficient in the
cleavage of the viral protein PE2 to E2 and E3 may be useful.

More particular screening processes of the invention are exemplified below.

Compartment Screening. In one embodiment of the invention, host cells 
comprising unique expression systems encoding exogenous proteins may be screened by 
placing them into separate compartments. For example, individual cells may be picked
physically and placed into a separate compartment, for example, a well in a 96-well plate.
Of course, if desired, a number of cells may be placed into a compartment or well, to
facilitate the screening of cells in groups. Alternatively, viral particles are diluted to 1 pfu
per compartment, leading to monoclones.
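
The dilution step can be illustrated with a short calculation. The following is a minimal sketch only: it assumes that plaque-forming units distribute over compartments according to Poisson statistics, an assumption that is not stated in the text above, and the function names are hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability that a compartment receives exactly k pfu (Poisson assumption)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def well_fractions(mean_pfu: float) -> dict:
    """Expected fractions of empty, single-pfu and multi-pfu compartments."""
    empty = poisson_pmf(0, mean_pfu)
    single = poisson_pmf(1, mean_pfu)
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}


def clonal_fraction(mean_pfu: float) -> float:
    """Among compartments that received virus, the share founded by one pfu."""
    f = well_fractions(mean_pfu)
    return f["single"] / (1.0 - f["empty"])


if __name__ == "__main__":
    for mean in (0.3, 1.0):
        f = well_fractions(mean)
        print(f"mean={mean}: single={f['single']:.3f}, "
              f"clonal among occupied={clonal_fraction(mean):.3f}")
```

Under this assumption, a mean of 1 pfu per compartment leaves roughly 37% of compartments with exactly one pfu, and about 58% of the occupied compartments are clonal; a lower mean raises the clonal share at the cost of more empty compartments.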

Semisolid Medium Screening. In another embodiment of the invention, cells used
in the method of the invention may be separated by growing them in soft agar. For
example, cells can be transfected with the nucleic acids of the expression system and then
spread out onto plates in a semisolid medium, i.e., soft agar. This soft agar, or semisolid
medium, may consist of, for example, normal growth medium containing agarose or
carboxymethylcellulose. Alternatively, a confluent layer of cells is infected with virus,
e.g., alphavirus, at an MOI that allows plaques to grow to a size of 1-3 mm. Three hours
postinfection the virus solution is removed and replaced by semisolid medium. This
semisolid medium inhibits diffusion of virus particles and allows plaque formation. Other
methods of maintaining cells in a semisolid medium known to the skilled artisan are also
within the scope of the invention.

Once plated out in the soft agar, or semisolid, medium, the host cells can then be
screened. This allows one to reduce lateral diffusion of substances on the plate and
therefore reduces cross contamination of the various cells or viruses that contain expression
systems expressing different exogenous proteins. In addition, this experimental setup
allows the flow of proteins secreted by the cells, for example, extracellular proteins
encoded by the nucleic acid of interest, from the cells to a filter that can bind such proteins.
Filters useful for the binding of proteins are, for example, nitrocellulose filters and PVDF
or nylon membranes.

Once exogenous proteins are bound to a filter, they may be screened by different 
methods known to the skilled artisan. In one embodiment of the invention, exogenous 
proteins are selectively labelled, see supra, to determine their location on the filter. For
example, by exposing the filter to a film, the film, once developed, will have a dark spot in
the place where it was juxtaposed to a spot of the filter that had bound labelled exogenous
protein. When using proper markings, one may correlate each spot on the film to a
particular area on the soft agar plate. Thus, one may locate the cells that express the
exogenous protein which gave rise to a spot on the film. Therefore, one may also locate the
particular expression system that contains the nucleic acid encoding the exogenous protein
that was secreted and gave rise to the spot on the film.

In another embodiment of the invention, one may screen filters with bound
exogenous protein by exposing the filters to a ligand. For example, one method of choice
may be to expose the filter with the bound exogenous proteins to a labelled lectin, thereby
screening for secreted glycoproteins. Or, for example, one may screen the filter with an
antibody specific to a particular post-translational modification such as, for example, a
sugar moiety. Also, one may use antibodies that are specific to a particular protein of
interest.
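
The correlation of a spot on the film with an area on the soft agar plate by means of markings can be sketched in code. This is an illustrative sketch only: it assumes two registration marks whose positions are known on both the film and the plate, and that film and plate differ by rotation, uniform scaling, translation and, optionally, a mirror flip. The function name and coordinate conventions are hypothetical and not taken from the text.

```python
def make_film_to_plate(film_marks, plate_marks, mirrored=False):
    """Build a mapping from film (x, y) to plate (x, y) from two registration marks.

    film_marks, plate_marks: two (x, y) tuples each, listed in the same order.
    mirrored: set True if the film image is a mirror image of the plate.
    """
    f1, f2 = (complex(*m) for m in film_marks)
    p1, p2 = (complex(*m) for m in plate_marks)
    if mirrored:
        # A mirror flip is a reflection followed by rotation/scale/shift.
        f1, f2 = f1.conjugate(), f2.conjugate()
    if f1 == f2:
        raise ValueError("registration marks on the film must be distinct")
    a = (p2 - p1) / (f2 - f1)  # rotation and scale as one complex factor
    b = p1 - a * f1            # translation

    def transform(spot):
        z = complex(*spot)
        if mirrored:
            z = z.conjugate()
        w = a * z + b
        return (w.real, w.imag)

    return transform


if __name__ == "__main__":
    to_plate = make_film_to_plate([(0, 0), (10, 0)], [(5, 5), (5, 15)])
    print(to_plate((0, 10)))
```

Two marks suffice for this simple model; using additional marks and a least-squares fit would be a natural extension if the film can stretch or shift during exposure.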

Furthermore, this system facilitates the easy separation of viral particles from the 
exogenous proteins. Separation can be accomplished by, for example, using a different 
filter paper, i.e., one that binds or stops the flow of viral particles but not the exogenous
protein. This additional filter paper may be placed underneath the filter that binds the 
exogenous protein. Therefore, the viral particles would not reach the filter paper that binds 
the exogenous protein. 

Filters can be rendered more effective by, for example, attaching to the filter antibodies
against the soluble viral proteins, which bind the soluble viral proteins but not the
exogenous protein.

Detection And Isolation Of Secreted Proteins. In a preferred embodiment of the 
invention, samples are prepared for detection of secreted radioactive protein, and
radioactivity is detected versus background.

According to the method of physical separation of the clones chosen, different
modes of detection of the secreted protein or glycoprotein can be applied. For example, in
cases where single clones were inoculated in microtiter plates, precipitation of the secreted
radioactive protein or glycoprotein in the supernatant can be carried out with, e.g., the TCA
method (Ma et al., 1994, Cancer Chemother. Pharmacol. 38(4):391-394). If physical
separation of the clones was achieved using semisolid growth medium, then the membrane
capturing the secreted radioactive protein or glycoprotein is washed and prepared for
autoradiography. Many different commercially available membranes or filters are useful as
tracer membranes. Protein captured at the membrane has to be bound such that the clones
secreting radioactive protein can be detected. Several different washing procedures can be
applied to link precipitable protein to the membrane and remove unincorporated radioactive
amino acids. Several methods are suitable for autoradiography, including exposure to an
X-ray film, with or without intensifying agents, and phosphorimaging.
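
Detecting radioactivity versus background in microtiter wells can be illustrated as follows. This is a minimal sketch, assuming that a set of control wells without exogenous protein is available and that a well is called positive when its count exceeds the background mean by a chosen number of standard deviations; the threshold rule, well labels and function name are illustrative assumptions, not part of the described method.

```python
from statistics import mean, stdev


def call_positive_wells(counts, background_wells, n_sd=3.0):
    """Return (positive wells, threshold) for precipitated-radioactivity counts.

    counts: dict mapping well label to measured counts (e.g., cpm).
    background_wells: labels of control wells used to estimate background
                      (at least two are needed for a standard deviation).
    n_sd: number of background standard deviations above the mean required.
    """
    background = [counts[w] for w in background_wells]
    threshold = mean(background) + n_sd * stdev(background)
    positives = sorted(
        well for well, value in counts.items()
        if well not in background_wells and value > threshold
    )
    return positives, threshold


if __name__ == "__main__":
    example = {"A1": 100, "A2": 110, "A3": 90, "B1": 105, "B2": 900}
    print(call_positive_wells(example, {"A1", "A2", "A3"}))
```

The same comparison could be applied to spot intensities read from an autoradiograph, with areas of the membrane away from any plaque serving as the background.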

Ligand Selection. In another embodiment of the invention, cells containing 
expression systems expressing different exogenous proteins may be separated by ligand 
recognition. For example, when screening for nucleic acids of interest encoding receptor
molecules of the cellular membrane, a ligand capable of recognizing such receptors may be
used to screen for cells that express such receptors. That screen may be carried out by, for
example, attaching the ligand to a solid support and exposing cells expressing the protein of
interest to the ligand on the solid support. By binding cells that express membrane receptor
molecules of interest, the cells expressing such receptors may be separated from cells not
expressing such receptors.

In one embodiment of the invention, the ligand may be a protein. In another 
embodiment, the ligand may be an antibody. In yet another embodiment, the ligand may be 
a lectin. In a further embodiment, the ligand may be a non-protein. The choice of the
ligand is dependent upon the protein that is to be selected for. A large number of ligands
capable of binding proteins based on the peptide backbone, a post-translational
modification, or other criteria have been described. The skilled artisan would readily know
which ligand is useful for the practice of this embodiment of the invention.

In another embodiment of the invention, the ligand is attached to a solid support. A
variety of solid supports useful for the practice of this embodiment have been described.
For example, one may attach the ligand to a plate to which one would add the cells
expressing exogenous protein. Once on the plate to which the ligand is bound, cells will
bind to the ligand if they express a membrane protein which recognizes the ligand. Cells
bound to the ligand will therefore remain on the plate when a washing solution is added for
the removal of cells that do not bind to the ligand. This way, cells expressing a membrane
receptor which binds the ligand will be separated from cells that do not. Consequently, this
is a fast and simple way to identify membrane receptors and the nucleic acids that encode
them. As membrane receptors are main targets for drug development, this embodiment of
the invention will be highly useful for pharmaceutical research and drug discovery.

Membrane Protein Selection. In another embodiment of the invention, one may
select for cells expressing an exogenous protein based on the exogenous protein being a
membrane protein with an extracellular domain. For example, one may express exogenous
proteins in cells while expression of endogenous proteins is inhibited or while using an
expression system the operation of which inhibits the expression of endogenous proteins,
see, supra.

Using this setup, one may, for example, treat the cells with proteases after inhibition
of expression of endogenous proteins has set in, but while expression of exogenous protein
is still continuing. After the cell surface has been deprived of extracellular proteinaceous
protrusions, i.e., cellular membrane receptors and so on, the expression of the exogenous
protein may replenish such protrusions. However, such replenishment will only occur if
the exogenous protein expressed in a particular cell is a membrane receptor molecule.

Therefore, at this step, only cells containing an expression system that encodes a
membrane receptor molecule will have extracellular proteinaceous protrusions. Such
protrusions may be used to bind the cell to any structure that binds proteins, regardless of
whether such binding occurs specifically or nonspecifically with regard to the structure of
the protein that is bound. Cells bound to such a protein binding structure may then, for
example, be further analyzed as to the sequence of the nucleic acid encoding the exogenous
protein. A variety of structures that bind proteins have been described and are well known
to the skilled artisan. For example, nitrocellulose, PVDF, or nylon filters may be used in
this embodiment as a protein binding structure. Other structures, known to the skilled
artisan, that bind proteins are also within the scope of the invention.

Membrane Glycoprotein Screening. In one embodiment of the invention, cell
membrane glycoproteins are identified using the method of the invention. To facilitate the
identification of cell membrane glycoproteins, a variety of methods have been described that
are useful for the practice of this embodiment of the invention. For example, one may alter
the chemical reactivity of cell surface sugar residues and thereby facilitate the specific
recognition and identification of such sugar residues. One method useful for such alteration
of chemical reactivity is, for example, the introduction of reactive ketone groups into cell
surface oligosaccharides. Mahal et al., 1997, Science 276:1125-1128. In this method one
introduces new chemically reactive groups into cell surface glycoproteins. When this
method is applied under conditions where the exogenous protein is expressed in the method
of the invention while the expression of endogenous proteins is inhibited, the exogenous
protein will preferentially incorporate the reactive group, thereby facilitating the recognition
of the exogenous protein. Other methods known to the skilled artisan that are useful for the
practice of this embodiment of the invention are also within the scope of the invention.

In another embodiment of the invention, membrane glycoproteins are identified
using the method of the invention through selective glycosylation of such membrane
glycoproteins. A variety of methods are known to selectively glycosylate membrane
proteins and are within the scope of the invention. For example, one may first remove all
cell surface sialic acid residues by using an enzyme exhibiting sialidase activity. After such
removal, one may, using the method of the invention, express the exogenous protein in
cells while the expression of endogenous proteins is inhibited. Thereby, the exogenous
protein will be preferentially transported in sialylated form to the cell surface, which
facilitates its identification.

Screening For Proteins Located In Other Cellular Fractions Of Interest. In one
embodiment, the present invention provides a general approach with which expression
systems comprising a nucleic acid encoding an exogenous protein located in any cellular
location of interest may be identified. Host cells harboring expression systems, each
comprising a unique nucleic acid encoding an exogenous protein, are provided in a manner
such that physical separation of each host cell comprising an individual expression system is
possible. Preferably, the expression systems comprise viral control elements. Most
preferably, the host cell/expression system allows viral particles comprising the expression
system to be released from the host cells.

For example, host cells may be plated in semisolid medium. First, a filter which is
capable of binding viral particles is placed on the semisolid medium under conditions and
for a duration which allow viral particles to be captured on the filter. The skilled artisan
will readily know what filters may be useful for the practice of this embodiment of the
invention. For example, the filter may be a nitrocellulose filter. The first filter having the
virus particles attached to it is treated in a way which facilitates recovery of infectious virus
particles comprising an expression system. Subsequently, the host cells in the semisolid
medium are exposed to a composition which allows labelling of the expressed exogenous
protein while expression of endogenous proteins is inhibited. For example, the exogenous
proteins may be labelled with radioactive amino acids while endogenous protein expression
is suppressed. In a next step, host cells in the semisolid medium are exposed to compositions
and/or conditions which allow recovery of the cellular fraction/compartment of interest. The
skilled artisan will know what conditions would be appropriate depending on the cellular
compartment/fraction under consideration. In a next step, a second filter having attached to
it, e.g., antibodies against certain structures comprised within the cellular fraction of
interest is placed on the semisolid medium. It is critical that the filter is saturated, i.e., does
not allow for any further molecules to bind to its surface. The second filter is placed under
conditions and for a duration which allow the predetermined cellular compartment of
interest to adhere. If, for example, the cellular component of interest is in the nucleus,
antibodies to nuclear membrane surface proteins would be attached to the filter. For other
compartments, corresponding methods would be used. The second filter thus constitutes a
"replica" of the first filter, where the first filter has attached to it the recoverable expression
system, while the second filter has attached to it proteins derived from the cellular
component of interest. Those individual clones of host cells which comprise an expression
system comprising a nucleic acid encoding an exogenous protein located within the cellular
fraction/component of interest may be identified through presence of the label.
Additionally, cellular compartments can be physically separated using classical methods
well-known in the art (e.g., differential centrifugation).

Identification Of Unknown Ligands Of Known Receptors. In one embodiment,
the invention allows for the identification of unknown secreted ligands of a known receptor
of interest. Host cells comprising expression systems comprising unique recombinant
nucleic acids encoding exogenous proteins are cultured in a way that they are physically
separable. For example, for this purpose the host cells may be cultured in semisolid
medium. The cells are incubated with compositions which allow labelling of expressed
exogenous proteins while expression of endogenous proteins is inhibited. A filter coated
with the receptor protein of interest, whereby all binding sites are saturated (either by the
receptor protein or by any other neutral protein), is placed on the semisolid medium under
conditions and for a duration which allow all secreted proteins to move to the filter. Since
all protein binding sites on the filter are saturated, only secreted proteins binding to the
receptor of interest may adhere to the filter. Those exogenous proteins may be identified
by means of their label. As the filters represent a replica, i.e., a mirror image of the
original plate, colonies harboring the expression system comprising a nucleic acid encoding
a ligand of the receptor of interest may be easily identified.

Identification Of Unknown Receptors Of Known Ligands. In one embodiment, the
invention allows for the identification of unknown receptors for a known ligand. Host cells
comprising expression systems comprising unique recombinant nucleic acids encoding
exogenous proteins are cultured in a way that they are physically separable. For example, a
confluent layer of cells can be infected at a low MOI with an infectious virus carrying the
plurality of expression systems. After infection, the cells are incubated at 34°C in
semisolid medium with a melting point around 37°C (e.g., low melting agarose). After
plaques of sufficient size are produced, a filter is placed on top long enough to capture viral
particles; this filter serves as a replica of the plate. Subsequently, the agarose is melted by
heating the plates to 39°C. After removal of the agarose and washing, the cells are
incubated with the labelled ligand of choice. After washing of the cells, the binding of the
ligand can be detected by autoradiography. Viral particles containing the expression
system (and nucleic acid) of interest can then be recovered from the replica filter.



4. Recovery Of Viral Particles Comprising Exogenous Proteins Of 
Interest 

In a next step, those viral particles from the samples or area where
increased radioactivity was observed are identified and recovered. The identity of the well
or the position of the plaques where exogenous protein of interest was detected can be
traced back using the developed film and viral particles can be recovered. At this point,
recombinant virus particles containing a nucleic acid sequence giving rise to expression of
the protein of interest are collected. According to the invention the particles can be applied
in, e.g., one of the procedures described below.

C. Applications Of The Identified Expression Systems Encoding Proteins 
Of Predetermined Properties 

The pools of expression systems selected with the above screening methods
for a predetermined property, e.g., cellular localization, structure, enzymatic function,
or affinity to other molecules, may be used for further screening procedures. Alternatively,
individual expression systems may be randomly picked and subjected to characterization
and functional analysis. See, infra.

Additional Screening Rounds To Avoid Contamination. Individual clones may be
picked and the procedure of steps 3 and 4 can be repeated to plaque purify the clones and to
avoid contamination.

Passaging The Viral Particles To Increase Their Number. The viral particles
isolated and optionally amplified by passaging can be applied to one of the following
procedures. Many useful applications for viral particles containing exogenous nucleic acids
encoding secreted proteins or glycoproteins are apparent, of which some are described
below. The present invention is therefore not limited to the applications described below.

Infection Of A Cell Culture To Induce Production Of Novel Secreted Protein Or
Glycoprotein, In Either Labelled Or Unlabelled Form. Novel secreted proteins or
glycoproteins can be produced at larger scale by infecting a substrate cell line growing in
vessels such as T-flasks, spinner flasks, roller bottles, bioreactors, perfusion reactors and the
like with the isolated viral particles. Depending on the nature of the genetic construct
chosen above, production will occur under slightly different conditions. When one-way
vectors are packaged in a packaging cell line, the particles will induce only one round of
infection (accordingly, the MOI should be higher than one). Also, a broad spectrum of host
cell lines can be used for production of the novel secreted protein or glycoprotein.
Labelling of the novel secreted protein or glycoprotein can be achieved by addition of
radioactive amino acids as described above. This offers the advantage that product
formation can easily be monitored by SDS polyacrylamide gel electrophoresis and
subsequent autoradiography. In this way an unknown product can be easily detected for
optimization of production parameters.
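
The recommendation that the MOI exceed one for single-round vectors can be illustrated numerically. The sketch below assumes that infectious particles distribute over cells according to Poisson statistics, a standard simplification that is not stated in the text; the function names are hypothetical.

```python
import math


def fraction_infected(moi: float) -> float:
    """Expected fraction of cells receiving at least one particle (Poisson assumption)."""
    if moi < 0:
        raise ValueError("MOI must be non-negative")
    return 1.0 - math.exp(-moi)


def moi_for_fraction(target: float) -> float:
    """MOI needed so that the given fraction of cells is infected in a single round."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target fraction must be in [0, 1)")
    return -math.log(1.0 - target)


if __name__ == "__main__":
    for moi in (0.1, 1.0, 3.0, 5.0):
        print(f"MOI {moi}: {fraction_infected(moi):.1%} of cells infected")
```

Because a one-way vector cannot spread to neighbouring cells, any cell left uninfected in the first round stays unproductive, which is why a higher MOI is desirable for production with such vectors.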

Application Of The So Produced Supernatant To Protein Purification Or
Isolation Methods And Detecting Novel Secreted Protein Or Glycoprotein By Its
Radioactive Label. The novel secreted protein or glycoprotein can easily be purified from
the supernatant of the above step using standard protein purification techniques. The
unique possibility to label the secreted protein during production, see supra, simplifies
purification processes significantly. Eluent from chromatography columns or the like can
automatically be collected using a sample collector. The samples containing the novel
secreted protein can then easily be identified by detection of radioactivity. Enriched and
purified material can be obtained in this way. Unlabelled material can be obtained by
application of identical chromatography conditions to unlabelled cell culture supernatant.

D. Characterization And Functional Analysis Of Individual Identified 
Proteins 

Several procedures known in the art may be employed for further 
characterization and functional analysis of individual identified exogenous proteins.

1. Sequencing of The Nucleic Acid Encoding The Exogenous 
Protein 

In one embodiment, the identity and the sequence of the exogenous 
nucleic acid in the recovered viral particles can be investigated by sequencing according to 
standard methods. Detailed description of suitable protocols can be found, for example, in
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor
(1989).

Sequence comparisons with known polynucleotide sequences in databases may
give indications of the function of an identified and isolated exogenous protein.



2. Analysis Of The Expression Pattern Of The Identified Exogenous
Protein

In one embodiment, the cDNA of the identified exogenous protein or
a fragment thereof is used as a probe to detect the expression of its mRNA. For example,
sections of tissue samples may be prepared and examined by in situ hybridization with a
suitable, labelled probe. Alternatively, mRNA extracts may be prepared and analyzed in
Northern blot analysis. As a further alternative, synthetic oligonucleotides designed
according to the identified protein's cDNA sequence may be generated and used as
hybridization probes. Detailed description of suitable protocols can be found, for example,
in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor
(1989).
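
Designing a synthetic oligonucleotide probe from the cDNA sequence can be sketched as follows. This is a minimal illustration, assuming that the cDNA is given as its sense (mRNA-like) strand in 5' to 3' orientation, so that a probe hybridizing to the mRNA is the reverse complement of the chosen window; the function name and the example sequence are hypothetical.

```python
# Translation table for complementing the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_probe(cdna: str, start: int, length: int) -> str:
    """Return the antisense oligonucleotide (5'->3') for a window of the sense cDNA.

    start is a 0-based offset into the sense strand; length is the probe length.
    """
    window = cdna.upper()[start:start + length]
    if len(window) != length:
        raise ValueError("window extends past the end of the cDNA")
    if set(window) - set("ACGT"):
        raise ValueError("window contains characters other than A, C, G, T")
    # Complement each base, then reverse to write the probe 5'->3'.
    return window.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(antisense_probe("ATGGCCAAGT", 0, 6))  # prints GGCCAT
```

In practice the window would also be chosen with regard to length, base composition and uniqueness within the transcript population; those criteria are outside the scope of this sketch.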

In one embodiment, the level of the gene's expression is assayed by detecting and
measuring its transcription. For example, RNA from a cell type or tissue known, or
suspected, to over- or under-express the gene, such as cancerous tissue, is isolated and
tested utilizing hybridization or PCR techniques such as are described herein. The isolated
cells can be derived from cell culture, from a patient, or from test animals. Such analyses
may reveal both quantitative and qualitative aspects of the expression pattern of the gene
encoding the identified exogenous protein, including activation or inactivation of its gene
expression.

Hybridization probes for Northern blot, Southern blot, and in situ hybridization may 
be labelled by a variety of reporter groups, including radionuclides such as 32P, 35S, and 3H
(in the case of in situ hybridization), or enzymatic labels, such as alkaline phosphatase,
coupled to the probe via avidin/biotin coupling systems, and the like. The labelled
hybridization probes may be prepared by any method known in the art for the synthesis of
DNA and RNA molecules. See, Section VI.H., infra. An additional use for nucleic acid 
hybridization probes involves their use as primers for polymerase chain reaction (PCR). 
PCR is described in detail in U.S. Patents 4,965,188, 4,683,195, and 4,800,195. 

DNA may be used in hybridization or amplification assays of biological samples to
detect abnormalities involving gene structure, including point mutations, insertions,
deletions and chromosomal rearrangements. Such assays include, but are not limited to,
Southern analyses, single stranded conformational polymorphism analyses (SSCP), and
PCR analyses.

Diagnostic methods for the detection of specific mutations of the gene encoding the
identified exogenous protein can involve, for example, contacting and incubating nucleic
acids including recombinant DNA molecules, cloned genes or degenerate variants thereof,
obtained from a sample, e.g., derived from a patient sample or other appropriate cellular
source, with one or more labelled nucleic acid reagents including recombinant DNA
molecules, cloned genes or degenerate variants thereof, under conditions favorable for the
specific annealing of these reagents to their complementary sequences within the gene.
Preferably, the lengths of these nucleic acid reagents are at least 15 to 30 nucleotides. After
incubation, all non-annealed nucleic acids are removed from the nucleic acid molecule
hybrid. The presence of nucleic acids which have hybridized, if any, is then detected.
Using such a detection system, the nucleic acid from the cell type or tissue of interest can
be immobilized, for example, to a solid support such as a membrane, or a plastic surface
such as that on a microtiter plate or polystyrene beads. In this case, after incubation,
non-annealed, labelled nucleic acid reagents are easily removed. Detection of the remaining,
annealed, labelled nucleic acid reagents is accomplished using standard techniques
well-known to those in the art. The sequences to which the nucleic acid reagents have
annealed are compared to the annealing pattern expected from a normal gene sequence in
order to determine whether a gene mutation is present.

Alternative methods for the detection of the gene's specific nucleic acid molecules,
in patient samples or other appropriate cell sources, may involve their amplification, e.g.,
by PCR (the experimental embodiment set forth in Mullis, K.B., 1987, U.S. Patent No.
4,683,202, see, supra), followed by the detection of the amplified molecules using
techniques well known to those of skill in the art. If mutations are intended to be
determined, the resulting amplified sequences can be compared to those which would be
expected if the nucleic acid being amplified contained only normal copies of the gene in
order to determine whether a gene mutation exists.
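
The comparison of an amplified sequence with the sequence expected from a normal copy of the gene can be sketched for the simplest case of point substitutions. This is an illustrative sketch only: it assumes that both sequences are already determined, cover the same region and have equal length; insertions and deletions would require a sequence alignment step that is not shown. The function name is hypothetical.

```python
def point_differences(reference: str, sample: str):
    """List substitutions between a normal reference and an amplified sample.

    Returns tuples of (1-based position, reference base, sample base).
    """
    if len(reference) != len(sample):
        raise ValueError(
            "sequences differ in length; insertions or deletions need an alignment"
        )
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(
            zip(reference.upper(), sample.upper()), start=1
        )
        if ref_base != sample_base
    ]


if __name__ == "__main__":
    print(point_differences("ACGTAC", "ACGAAC"))  # prints [(4, 'T', 'A')]
```

An empty result indicates that, over the compared region, the amplified sequence matches the normal gene sequence.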

3. Detection Of The Identified Exogenous Protein Using Antibodies

Antibodies directed against the identified exogenous protein of
interest or conserved variants or peptide fragments thereof, may also be used to gain more
insight into its expression pattern in vivo. Antibodies may also be used to detect
abnormalities in the level of the gene's expression, or abnormalities in the structure and/or
temporal, tissue, cellular, or subcellular location of its gene product. Such detection may be
performed in vivo or in vitro, such as, for example, on biopsy tissue.

The analysis may be performed on any tissue or cell type, or, with labelled
antibodies, even in vivo in a test animal. Alternatively, the tissue or cell type to be analyzed
may include those which are known, or suspected, to aberrantly express the identified
exogenous protein of interest, such as, for example, cancerous tissue.

The protein isolation methods employed herein may, for example, be such as those 
described in Harlow and Lane (Harlow, E. and Lane, D., 1988, "Antibodies: A Laboratory
Manual", Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York). The
isolated cells can be derived from cell culture, a laboratory animal, or a patient. 

For example, antibodies, or fragments of antibodies useful in the present invention
may be used to quantitatively or qualitatively detect the presence of the identified
exogenous protein of interest or conserved variants or peptide fragments thereof. This can
be accomplished, for example, by immunofluorescence techniques employing a
fluorescently labelled antibody (see, this Section, infra) coupled with light microscopic,
flow cytometric, or fluorimetric detection.

The antibodies (or fragments thereof) or fusion or conjugated proteins useful in the
present invention may, additionally, be employed histologically, as in immunofluorescence,
immunoelectron microscopy or non-immuno assays, for in situ detection of the identified
exogenous protein of interest or conserved variants or peptide fragments thereof, or for
catalytic subunit binding (in the case of labelled catalytic subunit fusion protein).

In situ detection may be accomplished by removing a histological specimen from a
patient, and applying thereto a labelled antibody or fusion protein of the present invention.
The antibody (or fragment) or fusion protein is preferably applied by overlaying the
labelled antibody (or fragment) onto a biological sample. Through the use of such a
procedure, it is possible to determine not only the presence of the identified exogenous
protein of interest, or conserved variants or peptide fragments, but also its distribution in
the examined tissue. Using the present invention, those of ordinary skill will readily
perceive that any of a wide variety of histological methods (such as staining procedures)
can be modified in order to achieve such in situ detection.

Immunoassays and non-immunoassays for the identified exogenous protein of interest
or conserved variants or peptide fragments thereof will typically comprise incubating a
sample, such as a biological fluid, a tissue extract, freshly harvested cells, or lysates of cells
which have been incubated in cell culture, in the presence of a detectably labelled antibody
capable of identifying the identified exogenous protein of interest or conserved variants or
peptide fragments thereof, and detecting the bound antibody by any of a number of
techniques well-known in the art.

The biological sample may be brought in contact with and immobilized onto a solid
phase support or carrier such as nitrocellulose, or other solid support which is capable of
immobilizing cells, cell particles or soluble proteins. The support may then be washed with
suitable buffers followed by treatment with the detectably labelled antibody or fusion
protein. The solid phase support may then be washed with the buffer a second time to
remove unbound antibody or fusion protein. The amount of bound label on the solid support
is then detected by conventional means.

By "solid phase support or carrier" is intended any support capable of binding an
antigen or an antibody. Well-known supports or carriers include glass, polystyrene,
polypropylene, polyethylene, dextran, nylon, amylases, natural and modified celluloses,
polyacrylamides, gabbros, and magnetite. The nature of the carrier can be either soluble to
some extent or insoluble for the purposes of the present invention. The support material
may have virtually any possible structural configuration so long as the coupled molecule is
capable of binding to an antigen or antibody. Thus, the support configuration may be
spherical, as in a bead, or cylindrical, as in the inside surface of a test tube, or the external
surface of a rod. Alternatively, the surface may be flat such as a sheet, test strip, etc.
Preferred supports include polystyrene beads. Those skilled in the art will know many
other suitable carriers for binding antibody or antigen, or will be able to readily ascertain
the same.

The binding activity of a given lot of antibody or fusion protein is determined
according to well known methods. Those skilled in the art will be able to readily determine
operative and optimal assay conditions.

With respect to antibodies, one of the ways in which the antibody can be detectably
labelled is by linking the same to an enzyme for use in an enzyme immunoassay (EIA)
(Voller, 1978, Diagnostic Horizons 2:1-7, Microbiological Associates Quarterly
Publication, Walkersville, MD; Voller et al., 1978, J. Clin. Pathol. 31:507-520; Butler,
1981, Meth. Enzymol. 73:482-523; Maggio, E. (ed.), 1980, Enzyme Immunoassay, CRC
Press, Boca Raton, FL; Ishikawa et al. (eds.), 1981, Enzyme Immunoassay, Igaku Shoin,
Tokyo). The enzyme which is bound to the antibody will react with an appropriate
substrate, preferably a chromogenic substrate, in such a manner as to produce a chemical
moiety which can be detected, for example, by spectrophotometric, fluorimetric or
visual means. Enzymes which can be used to detectably label the antibody include, but are
not limited to, malate dehydrogenase, staphylococcal nuclease, delta-5-steroid isomerase,
yeast alcohol dehydrogenase, alpha-glycerophosphate dehydrogenase, triose phosphate
isomerase, horseradish peroxidase, alkaline phosphatase, asparaginase, glucose oxidase,
beta-galactosidase, ribonuclease, urease, catalase, glucose-6-phosphate dehydrogenase,
glucoamylase and acetylcholinesterase. The detection can be accomplished by colorimetric
methods which employ a chromogenic substrate for the enzyme. Detection may also be
accomplished by visual comparison of the extent of enzymatic reaction of a substrate in
comparison with similarly prepared standards.
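
The comparison with similarly prepared standards can also be made numerically rather than
by eye. The following is only a minimal sketch, assuming that the colorimetric signal
(absorbance) rises with the amount of protein and that a straight line between two
neighbouring standards is an acceptable approximation. The function name and the standard
values are hypothetical and are not taken from this specification.

```python
from bisect import bisect_left


def interpolate_concentration(absorbance, standards):
    """Estimate a concentration by linear interpolation between bracketing standards.

    standards: iterable of (concentration, absorbance) pairs measured under the
    same conditions as the sample (hypothetical values, for illustration only).
    """
    points = sorted(standards, key=lambda p: p[1])  # order the standards by signal
    signals = [a for _, a in points]
    if not signals[0] <= absorbance <= signals[-1]:
        raise ValueError("absorbance lies outside the range of the standards")
    i = bisect_left(signals, absorbance)  # first standard with signal >= sample
    if signals[i] == absorbance:
        return points[i][0]
    (c0, a0), (c1, a1) = points[i - 1], points[i]
    return c0 + (absorbance - a0) * (c1 - c0) / (a1 - a0)


# Hypothetical standards: (concentration in ng/ml, absorbance)
example_standards = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45)]
estimate = interpolate_concentration(0.35, example_standards)
```

A sample whose signal falls outside the range of the standards is rejected rather than
extrapolated, since the enzymatic reaction is not assumed to remain linear there.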

Detection may also be accomplished using any of a variety of other immunoassays.
For example, by radioactively labelling the antibodies or antibody fragments, it is possible
to detect the identified exogenous protein of interest through the use of a radioimmunoassay
(RIA) (see, for example, Weintraub, B., Principles of Radioimmunoassays, Seventh
Training Course on Radioligand Assay Techniques, The Endocrine Society, March, 1986).
The radioactive isotope can be detected by such means as the use of a gamma counter or a
scintillation counter or by autoradiography.

It is also possible to label the antibody with a fluorescent compound. When the
fluorescently labelled antibody is exposed to light of the proper wavelength, its presence
can then be detected due to fluorescence. Among the most commonly used fluorescent
labelling compounds are fluorescein isothiocyanate, rhodamine, phycoerythrin,
phycocyanin, allophycocyanin, o-phthaldehyde and fluorescamine.

The antibody can also be detectably labelled using fluorescence emitting metals
such as 152Eu, or others of the lanthanide series. These metals can be attached to the
antibody using such metal chelating groups as diethylenetriaminepentacetic acid (DTPA) or
ethylenediaminetetraacetic acid (EDTA).

The antibody also can be detectably labelled by coupling it to a chemiluminescent
compound. The presence of the chemiluminescent-tagged antibody is then determined by
detecting the presence of luminescence that arises during the course of a chemical reaction.
Examples of particularly useful chemiluminescent labelling compounds are luminol,
isoluminol, theromatic acridinium ester, imidazole, acridinium salt and oxalate ester.

Likewise, a bioluminescent compound may be used to label the antibody of the
present invention. Bioluminescence is a type of chemiluminescence found in biological
systems in which a catalytic protein increases the efficiency of the chemiluminescent
reaction. The presence of a bioluminescent protein is determined by detecting the presence
of luminescence. Important bioluminescent compounds for purposes of labelling are
luciferin, luciferase and aequorin.

Infection Of Animal Tissues Or Cells. The viral particles obtained by the
procedure described above can be used as gene transfer vectors to deliver the gene of
interest to a tissue or animal. This approach is useful for the investigation of the nature of
the protein of interest or could be used, e.g., for the production of antibodies against the
novel protein as a result of its expression in the host.

E. Uses Of Proteins Having A Predetermined Property 

The nucleic acids which are identified, characterized and isolated using the
methods and expression systems of this invention, and, more importantly, proteins encoded
by same, may be useful for a number of applications. First, in many instances they will be
useful for research applications and laboratory use. For example, the discovery and isolation
of new growth factors, cytokines, or hormones may facilitate the growth of cells in culture
which could not be cultured before. The discovery and isolation of membrane receptors,
cytoplasmic, and nuclear proteins will be useful to gain more insight into important cellular
signal transduction and control processes. The discovery and isolation of organelle proteins
may provide more insight into metabolic, anabolic, and processing functions.

However, some of the genes and gene products identified and isolated by the present 
invention may directly be used as therapeutic agents or, alternatively, as therapeutic targets. 


1. Use As Therapeutic Agents 

Proteins, in particular the secreted proteins, identified and isolated
with the expression systems of the present invention may be useful as therapeutic agents.
In fact, a number of the already known secreted proteins, including cytokines and peptide
hormones, are manufactured and used as therapeutic agents. Many severe diseases are
caused by a lack of, or insufficient amounts of, certain secreted proteins which serve as
intercellular communicators, for example in response to environmental changes or other
physiological needs. A well known example is diabetes mellitus, which most frequently is
caused by deficiencies in the production of the peptide hormone insulin. Many other
examples are known. In light of the fact that only a small percentage of all secreted
proteins have been identified, isolated and characterized thus far, it can be expected that
many of the novel secreted proteins uncovered by the present invention will be useful as
therapeutic compounds for the treatment of diseases, including, but not limited to, diseases
relating to aberrant cell proliferation, metabolic signals, or certain other incapabilities of
cells to communicate properly.

In addition, proteins which are derived from other cellular localizations may be
useful as therapeutic agents. For example, a number of proteins expressed in the nucleus,
including apoptosis-inducing proteins or tumor suppressors, are candidates for use as agents
to treat, e.g., cancer.

The skilled artisan will be able to determine which proteins identified using the
methods and expression systems of the invention may be useful as therapeutic targets. For
example, in vivo assays using inhibitory antibodies or antisense strategies may be employed
to elucidate the function of a protein. Further, purified forms of the protein may be used
for cellular assays and tests in animal disease models to determine their value as therapeutic
compounds.

2. Use As Therapeutic Targets 

Another important aspect of the invention is to provide proteins
which serve as specific therapeutic targets for the treatment of pathological disorders. While
many diseases relating to inappropriate function or abundance of secreted proteins may be
treated by administration of the respective protein (or an equivalent thereof), the situation is
often different for proteins located within the cell. Deficiencies in the production of
hormones or cytokines may frequently be remedied by their administration to reinstate a
physiological response. In contrast, many other cellular fractions, in particular the cell
membrane and the nucleus, are known to include numerous proteins which, if they function
improperly, may be the cause of severe diseases, including cancer and other proliferative
disorders, and a variety of metabolic diseases. For example, breast cancer is, in 30% of all
occurrences, thought to be caused by overexpression of a cell surface receptor, i.e., the
receptor tyrosine kinase HER2. Slamon, 1989, supra. It is believed that this type of breast
cancer may be treated by administration of HER2 inhibitors to the appropriate target site.
As an example of a metabolic disease, type II diabetes mellitus (Non-Insulin-Dependent
Diabetes Mellitus, NIDDM) may be caused by aberrantly expressed or defective insulin
receptors. Taira, 1989, supra. Type II diabetes could be treated with a therapeutic
compound which bypasses the receptor and activates a molecule downstream in the insulin
receptor's signal transduction pathways. Alternatively, type II diabetes could be treated
with compounds which are able to activate the defective insulin receptor.
In the case of many diseases, both diseases related to inappropriate cell proliferation
and diseases relating to metabolic defects, the biological cause is not yet identified, but is
thought to be due to inappropriate function or abundance of cellular proteins. Proteins
identified by the methods of the present invention will be useful for the development of
targeted therapeutic approaches, including drug development and gene therapy, or as
diagnostics/reagents.

If a protein identified by the methods of the invention is used as a therapeutic target,
it may be subjected to suitable assays for the identification and isolation of compounds
modulating its activity.

F. Generation And Use Of Antibodies Directed Against Proteins Identified
By The Methods Of The Invention

Various procedures known in the art may be used for the production of
antibodies to epitopes of the recombinantly produced proteins identified and isolated
employing the methods of the present invention. Such antibodies include but are not
limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments and fragments
produced by an Fab expression library.

In one embodiment of the invention, such antibodies are used as tools for the
investigation of expression pattern, abundance, function and other characteristics of the
proteins identified by the methods of the invention. For example, cells or tissue sections
may be exposed to antibodies, allowing characterization of the expression specificity and
abundance of the identified proteins of the invention. Various types of antibodies,
including monoclonal, polyclonal and single chain antibodies, may be used for such
experiments determining the cell and tissue specific expression pattern of the identified
protein. Numerous techniques are well-established which allow detection of antibody
binding to the expressed proteins in the cell or tissue section.

Furthermore, labelled antibodies, e.g., radiolabelled antibodies, may be used to
determine the expression patterns and abundance of the identified protein in vivo. See,
supra. For example, the labelled antibody may be injected into a test subject, and its
binding sites monitored in order to determine in which cell and/or tissue types the identified
protein is located. To the extent that the originally identified and isolated protein of interest
is the product of a species different from the test animal, the use of polyclonal antibodies is
preferred. Thus, for example, if the identified protein used to generate the antibody is
human, and the test subject is not human, polyclonal antibodies, with exceptions, will be
preferred.

Furthermore, antibodies may be employed to analyze the physiological function of
the protein identified with the methods and expression systems of the invention. For
example, inhibitory antibodies, i.e., antibodies binding to certain functional epitopes of the
identified protein of interest, may be identified, rationally or empirically, and injected into
test cells or laboratory animals. Effects on the cells, certain tissues or physiological
functions may be identified, allowing elucidation of the biological role of the identified
protein and its potential involvement in aberrant or pathological conditions. Once the
biological function and role in disease development is identified, development and design of
targeted treatment strategies may be initiated. One such strategy is the identification and
isolation of compounds favorably modulating the function of a target protein, which is
discussed in more detail below.

In other embodiments, antibodies are useful, e.g., as diagnostic or therapeutic
agents. As therapeutic agents, neutralizing antibodies, i.e., those which compete for
binding with a ligand, substrate or adapter molecule, or which otherwise interfere with the
proteins of the invention, where the proteins are used as therapeutic targets, are especially
preferred.

For use as diagnostic agents, monoclonal antibodies that bind to the identified
protein are radioactively labelled, allowing detection of their location and distribution in the
body after injection. Radioactively tagged antibodies may be used as a non-invasive
diagnostic tool for imaging in vivo the presence of tumors and metastases associated with
the expression of the identified protein of the invention.

Immunotoxins may also be designed which target cytotoxic agents to specific sites
in the body. For example, high affinity monoclonal antibodies may be covalently
complexed to bacterial or plant toxins, such as diphtheria toxin, abrin, or ricin. A general
method of preparation of antibody/hybrid molecules may involve use of thiol-crosslinking
reagents such as SPDP, which attack the primary amino groups on the antibody and, by
disulfide exchange, attach the toxin to the antibody. The hybrid antibodies may be used to
specifically eliminate cells expressing the protein identified by the methods of the
invention.

For the production of antibodies, various host animals, including, but not limited to,
rabbits, mice, rats, etc., are immunized by injection with the identified protein of interest.
Various adjuvants may be used to increase the immunological response, depending on the
host species, including but not limited to Freund's adjuvant (complete and incomplete),
mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin,
pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin,
dinitrophenol, and potentially useful human adjuvants such as BCG (bacille
Calmette-Guerin) and Corynebacterium parvum.

Monoclonal antibodies to the proteins of the invention may be prepared using any
technique which provides for the production of antibody molecules by continuous cell lines
in culture. These include, but are not limited to, the hybridoma technique originally
described by Kohler and Milstein, 1975, Nature 256:495-497, the human B-cell hybridoma
technique (Kosbor et al., 1983, Immunology Today 4:72; Cote et al., 1983, Proc. Natl.
Acad. Sci. U.S.A. 80:2026-2030) and the EBV-hybridoma technique (Cole et al., 1985,
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). In addition,
techniques developed for the production of "chimeric antibodies" (Morrison et al., 1984,
Proc. Natl. Acad. Sci. U.S.A. 81:6851-6855; Neuberger et al., 1984, Nature 312:604-608;
Takeda et al., 1985, Nature 314:452-454) by splicing the genes from a mouse antibody
molecule of appropriate antigen specificity together with genes from a human antibody
molecule of appropriate biological activity can be used. Alternatively, techniques described
for the production of single chain antibodies (U.S. Patent 4,946,778) can be adapted to
produce single chain antibodies specific for the proteins of the invention.



WO 99/25876 PCT/US98/24520 


Antibody fragments which contain specific binding sites of the cell proliferation
gene may be generated by known techniques. For example, such fragments include, but are
not limited to, F(ab')2 fragments which can be produced by pepsin digestion of the antibody
molecule and the Fab fragments which can be generated by reducing the disulfide bridges
of the F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed (Huse
et al., 1989, Science 246:1275-1281) to allow rapid and easy identification of monoclonal
Fab fragments with the desired specificity to the protein of the invention.



G. Use Of The Identified Nucleic Acids Encoding A Target Protein Of
Interest For Development Of Antisense Approaches And Ribozymes

The development and the use of oligonucleotide or oligoribonucleotide
sequences comprising antisense DNA or RNA molecules or ribozymes that function to
inhibit the translation of the cell proliferation gene mRNA may fulfill either of two
purposes. First, such approaches may be used to investigate the function of novel proteins
identified with the methods of the present invention. Second, these approaches may serve
as actual treatment methods once the function of a protein and its involvement in
pathological conditions is established. For example, antisense DNA or RNA molecules act
to directly block the translation of the targeted gene by binding to the targeted mRNA and
thus preventing protein translation.
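
Because an antisense molecule binds its target by base pairing, its sequence is the reverse
complement of the chosen mRNA segment. A minimal sketch of that derivation follows; the
function name and the example segment are hypothetical, and a DNA oligonucleotide is assumed
only for the sake of illustration.

```python
# Watson-Crick partner of each RNA base, written as a DNA base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def antisense_dna(mrna_segment):
    """Return the antisense DNA oligonucleotide (5'->3') for an mRNA segment (5'->3')."""
    seq = mrna_segment.upper()
    unknown = set(seq) - set(COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected characters in RNA sequence: {sorted(unknown)}")
    # Reverse the segment, then complement each base, so the result reads 5'->3'.
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Hypothetical segment spanning a start codon.
oligo = antisense_dna("AUGGCC")
```

For an antisense RNA rather than DNA, the same mapping applies with U in place of T.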

Ribozymes are enzymatic RNA molecules capable of catalyzing the specific
cleavage of RNA. The mechanism of ribozyme action appears to involve site specific
hybridization of the ribozyme molecule to complementary sequences of the target RNA,
followed by endonucleolytic cleavage. In one embodiment of the invention, ribozyme
molecules are engineered that specifically catalyze endonucleolytic cleavage of mRNA of
the genes identified with the methods of the invention.

Suitable target sites for ribozyme activity are identified by first scanning the target
molecule for potential ribozyme cleavage motifs, and second by evaluating the structural
features of the about 15 to 25 ribonucleotides corresponding to the region of the target
molecule containing the identified cleavage recognition site. Further, the suitability of the
candidate targets may also be evaluated by testing their accessibility to hybridization with
complementary oligonucleotides, using ribonuclease protection assays. Bordonaro et al.,
1994, Biotechniques 16:428-430.
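
The first step, scanning the target RNA for potential cleavage motifs and extracting the
surrounding region for structural evaluation, may be sketched as follows. This is an
illustration only. The specification does not name the motifs, so the triplets GUA, GUU and
GUC commonly associated with hammerhead ribozymes are assumed here, and the function name,
window size and example sequence are hypothetical.

```python
import re

# Assumed hammerhead cleavage triplets; not taken from this specification.
CLEAVAGE_MOTIFS = ("GUA", "GUU", "GUC")


def candidate_sites(rna, window=20):
    """Yield (cleavage_position, motif, flanking_region) for each motif occurrence.

    cleavage_position is the 0-based index immediately after the motif triplet.
    flanking_region is a stretch of roughly `window` ribonucleotides centred on the
    motif, truncated at the ends of the sequence.
    """
    rna = rna.upper().replace("T", "U")  # accept cDNA-style input as well
    # A lookahead is used so that overlapping motif occurrences are all reported.
    pattern = re.compile("(?=(" + "|".join(CLEAVAGE_MOTIFS) + "))")
    half = (window - 3) // 2
    for match in pattern.finditer(rna):
        start = match.start()
        lo = max(0, start - half)
        hi = min(len(rna), start + 3 + half)
        yield start + 3, match.group(1), rna[lo:hi]


# Hypothetical 12-nucleotide example with a short window for readability.
sites = list(candidate_sites("AAGUCAAGUAGG", window=7))
```

Each flanking region returned by the sketch would then be passed to whatever structural
evaluation or accessibility test is chosen; the sketch itself makes no structural prediction.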

The labelled hybridization probes, antisense DNA and RNA oligonucleotides, and
ribozymes of the subject invention are prepared by any method known in the art for the
synthesis of DNA and RNA molecules. For example, oligonucleotides may be synthesized
chemically using commercially available DNA or RNA synthesizers, such as the machines sold
by Applied Biosystems. Alternatively, RNA molecules may be generated by in vitro and in
vivo transcription of DNA sequences encoding the antisense RNA molecule. Such DNA
sequences may be incorporated into a wide variety of vectors which comprise suitable RNA
polymerase promoters such as the T3, T7, or the SP6 polymerase promoters. Alternatively,
antisense cDNA constructs that synthesize antisense RNA constitutively or inducibly may
be introduced stably into cell lines.

Various modifications to the DNA and RNA molecules may be introduced as a
means of increasing the intracellular stability and half-life. For example, flanking
sequences of ribo- or deoxyribo-nucleotides may be added to the 5' and/or 3' ends of the
molecule, or phosphorothioate or 2' O-methyl rather than phosphodiester linkages may be
used within the oligonucleotide backbone. Xu et al., 1996, Nucleic Acids Res. 24:1602-1607.

H. Generation Of Compounds Targeting The Identified, Isolated And
Characterized Proteins

1. Identification Of Compounds 

For the identification and isolation of compounds modifying,
inhibiting or enhancing the function of an identified and characterized protein according to
the invention, suitable cellular systems expressing the identified protein may be employed.
Alternatively, the proteins identified by the process of the invention may be isolated and
used for in vitro or in vivo assays for the identification and isolation of compounds
specifically interfering with their activity. Generally, the type of assay employed will
largely depend on the nature and functional characteristics of the identified protein used as
target, in the following referred to as "target protein". For example, if the target protein is a
growth factor receptor, a cellular assay may be employed wherein the influence of
compounds on proliferation is measured. If the target protein is a secreted protein, assays
may be employed involving the effects of isolated secreted protein on suitable systems, e.g.,
receptor systems which allow one to measure the function of the secreted protein. If the
target protein is a transcription factor, cellular systems may be employed which allow
assaying the impact of compounds on expression of genes which are controlled by the
transcription factor. If the target protein is an apoptosis regulator, cellular systems may be
employed in which cell death (or survival) is driven by the respective target protein, and the
impact of compounds may be assayed.

More specifically, cells in an appropriate assay system expressing the target protein
may be exposed to chemical compounds or compound libraries to identify compounds
having the desired modulating effects. Alternatively, the target protein may be expressed in
suitable expression systems, designed to allow for high-throughput testing of compounds
from any source, optionally isolated, to identify molecules binding to or having measurable
inhibitory effects on the target protein.

Nucleotide sequences encoding the target protein identified and isolated using the
methods of the invention may be used to produce the corresponding purified protein using
well-known methods of recombinant DNA technology. Among the many publications that
teach methods for the expression of genes after they have been isolated is Gene Expression
Technology, Methods in Enzymology, Vol. 185, edited by Goeddel, Academic Press, San
Diego, California (1990).

The protein of the invention chosen as target protein may be expressed in a variety
of host cells, either prokaryotic or eukaryotic. In many cases, the host cells would be
eukaryotic, more preferably host cells would be mammalian. Host cells may be from
species either the same as or different from the species in which the nucleic acid sequences
encoding the protein identified with the methods of the invention are naturally present, i.e.,
endogenous. Advantages of producing the chosen target protein by recombinant DNA
technology in cellular expression systems other than the expression systems primarily used,
see supra, for the original identification and isolation of the proteins according to their
cellular localization include the development of optimized assay systems for the
identification of modulating compounds. Generally, the expression systems of the
invention have the advantage that they readily provide a system for the production of large
amounts of recombinant proteins. However, under certain circumstances which the skilled
artisan will appreciate, alternative expression systems may, in some instances, also prove
advantageous for obtaining highly enriched sources of the proteins for purification and the
availability of simplified purification procedures. Methods for recombinant production of
proteins are generally very well established in the art, and can be found, among other places,
in Sambrook et al., supra.

In an embodiment of the invention, cells transformed with expression vectors
encoding the identified protein of the invention are cultured under conditions favoring
expression of its gene sequence and the recovery of the recombinantly-produced protein
from the cell culture. A target protein of interest produced by a recombinant cell may be
secreted or may be contained intracellularly, depending on the nature of the gene and the
particular genetic construction used. In general, it is more convenient to prepare
recombinant proteins in secreted form. Purification steps will depend on the nature of the
production and the particular protein produced. Purification methodologies are well
established in the art; the skilled artisan will know how to optimize the purification
conditions. General protocols of how to optimize the purification conditions for a
particular protein can be found, among other places, in Scopes in: Protein Purification:
Principles and Practice, 1982, Springer-Verlag New York, Heidelberg, Berlin.

In addition to recombinant production, peptide fragments may be produced by direct
peptide synthesis using solid-phase techniques. See, Stewart et al., Solid-Phase Peptide
Synthesis (1969), W. H. Freeman Co., San Francisco; and Merrifield, 1963, J. Am. Chem.
Soc. 85:2149-2154.

In vitro polypeptide synthesis may be performed using manual techniques or by
automation. Automated synthesis may be achieved, for example, using an Applied Biosystems
431A Peptide Synthesizer (Foster City, California) following the instructions provided in
the instruction manual supplied by the manufacturer.

In an embodiment of the invention, the target protein and/or cell lines expressing the
target protein are used to screen for antibodies, peptides, organic molecules or other ligands
that act as agonists or antagonists of the cell proliferation gene activity. For example,
antibodies capable of interfering with the activity, e.g., enzymatic activity, of the target
protein, or with its interaction with a ligand, adapter molecule, or substrate, are used to
inhibit the target protein's function. In cases where amplification of the target protein's
function is desired, antibodies which mimic, e.g., a ligand, an adapter molecule or substrate
of the corresponding signal transduction pathway may be developed. Obviously, if
desired, antibodies may be generated which modify the activity, function, or specificity of
the target protein.

Alternatively, screening of peptide libraries or organic compounds with 
recombinantly expressed target protein or cell lines expressing the target protein may be 
useful for identification of therapeutic molecules that function by inhibiting, enhancing, or 
modifying its biological activity. 

Synthetic compounds, natural products, and other sources of potentially biologically
active materials can be screened in a number of ways. The ability of a test compound to
inhibit, enhance or modulate the function of the target protein may be determined with
suitable assays measuring the target protein's function. For example, responses such as its
activity, e.g., enzymatic activity, or the target protein's ability to bind its ligand, adapter
molecule or substrate may be determined in in vitro assays. Cellular assays can be
developed to monitor a modulation of second messenger production, changes in cellular
metabolism, or effects on cell proliferation. These assays may be performed using
conventional techniques developed for these purposes. Finally, the ability of a test
compound to inhibit, enhance or modulate the function of the target protein will be
measured in suitable animal models in vivo. For example, mouse models will be used to
monitor the ability of a compound to inhibit the development of solid tumors, or effect
reduction of the solid tumor size.

In an embodiment of the invention, random peptide libraries consisting of all
possible combinations of amino acids attached to a solid phase support are used to identify
peptides that are able to interfere with the function of the target protein. For example,
peptides may be identified which bind to a ligand-, adapter molecule- or substrate-binding
site of a given target protein or to other functional domains of the target protein, such as an
enzymatic domain. Accordingly, the screening of peptide libraries may result in
compounds having therapeutic value as they interfere with the target protein's activity.

Identification of molecules that are able to bind to the target protein may be
accomplished by screening a peptide library with recombinant soluble target protein.
Methods for expression and purification of the selected target protein may be used to
express recombinant full length target protein or fragments thereof, depending on the
functional domains of interest.

In order to identify and isolate the peptide/solid phase support that interacts and
forms a complex with the target protein, it is necessary to label or "tag" the target protein
molecule or fragment thereof. For example, the target protein may be conjugated to
enzymes such as alkaline phosphatase or horseradish peroxidase or to other reagents such as
fluorescent labels which may include fluorescein isothiocyanate (FITC), phycoerythrin
(PE) or rhodamine. Conjugation of any given label to the target protein may be performed
using techniques that are routine in the art.

In addition to using soluble target protein molecules or fragments thereof, in another
embodiment, peptides that bind to the target protein may be identified using intact cells.
The use of intact cells is preferred in instances where the target protein comprises
cell surface receptors, which require the lipid domain of the cell membrane to be
functional. Cell lines may be generated that express the target protein identified with
the methods and expression systems of the invention. The cells used in this technique may
be either live or fixed cells. The cells are incubated with the random peptide library and
will bind to certain peptides in the library. The so formed complex between the target cells
and the relevant solid phase support/peptide may be isolated by standard methods known in
the art, including differential centrifugation.

In the case the target protein is a membrane bound receptor or a receptor that 
requires the lipid domain of the cell membrane to be functional, an alternative to whole cell 
assays is to reconstitute the receptor molecules into liposomes where a label or "tag" can be 
attached. 

In another embodiment, cell lines that express the chosen target protein or,
alternatively, isolated target protein or fragments thereof, are used to screen for molecules
that inhibit, enhance, or modulate the target protein's activity or, where applicable, signal
transduction. Such molecules may include small organic or inorganic compounds, or other
molecules that affect the target protein's activity or that promote or prevent the complex
formation with its ligand, adapter molecules, or substrates. Synthetic compounds, natural
products, and other sources of potentially biologically active materials can be screened in a
number of ways, which are generally known by the skilled artisan.

For example, the ability of a test molecule to interfere with the chosen target
protein's function may be measured using standard biochemical techniques. Alternatively,
cellular responses such as activation or suppression of a catalytic activity, phosphorylation,
dephosphorylation, or other modification of other proteins, activation or modulation of
second messenger production, changes in cellular ion levels, association, dissociation or
translocation of signalling molecules, or transcription or translation of specific genes may
also be monitored. These assays may be performed using conventional techniques
developed for these purposes in the course of screening.

Further, effects on the target protein's function may, via signal transduction
pathways, affect a variety of cellular processes. Cellular processes under the control of
its signalling pathway may include, but are not limited to, normal cellular functions,
proliferation, differentiation, maintenance of cell shape, and adhesion, in addition to
abnormal or potentially deleterious processes such as unregulated cell proliferation, loss of
contact inhibition, and blocking of differentiation or cell death. The qualitative or
quantitative observation and measurement of any of the described cellular processes by
techniques known in the art may be advantageously used as a means of scoring for signal
transduction in the course of screening.

Various technologies may be employed for the screening, identification, and
evaluation of compounds which interact with the chosen target protein of the invention,
which compounds may affect various cellular processes under the control of said target
protein.

For example, the target protein or a functional derivative thereof, in pure or semi-pure
form, in a membrane preparation, or in a whole live or fixed cell is incubated with the
compound. Subsequently, under suitable conditions, the effect of the compound on the
target protein's function is scrutinized, e.g., by measuring its activity, or its signal
transduction, and comparing the activity to that of the target protein, incubated under the same
conditions, without the compound, thereby determining whether the compound stimulates
or inhibits the target protein's activity.
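
The comparison of activity with and without the compound can be illustrated by a
minimal sketch. It is not part of the disclosed assays: the function name, the 10%
tolerance band used to call "no clear effect", and the example activity values are
assumptions chosen only for illustration.

```python
def classify_compound(activity_with, activity_without, tolerance=0.10):
    """Compare target-protein activity measured with and without a test compound.

    Both activities must be in the same (arbitrary) units and measured under the
    same conditions. `tolerance` is a hypothetical cut-off for the relative change
    below which no effect is called; it is not taken from the specification.
    """
    if activity_without <= 0:
        raise ValueError("control activity must be positive")
    # Relative change versus the compound-free control
    relative_change = (activity_with - activity_without) / activity_without
    if relative_change > tolerance:
        label = "stimulates"
    elif relative_change < -tolerance:
        label = "inhibits"
    else:
        label = "no clear effect"
    return label, relative_change


if __name__ == "__main__":
    # Hypothetical readings: control = 100 units, with compound = 50 units
    print(classify_compound(50.0, 100.0))
```

In practice the tolerance would be set from the variability of replicate control
measurements rather than fixed in advance.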
In addition to the use of whole cells expressing the chosen target protein for the
screening of compounds, the invention also includes methods using soluble or immobilized
target protein. For example, molecules capable of binding to the target protein may be
identified within a biological or chemical preparation. For example, the target protein, or
functional fragments thereof, e.g., fragments containing a specific domain of interest, is
immobilized to a solid phase matrix; subsequently, a chemical or biological preparation is
contacted with the immobilized target protein for an interval sufficient to allow the
compound to bind. Any unbound material is then washed away from the solid phase
matrix, and the presence of the compound bound to the solid phase is detected, whereby the
compound is identified. Suitable means are then employed to elute the binding compound.

2. Source Of Candidate Test Compounds

The test compounds employed for assays for the identification of
modulators of a target protein's activity are obtained from any commercial source,
including Aldrich (1001 West St. Paul Ave., Milwaukee, WI 53233), Sigma Chemical
(P.O. Box 14508, St. Louis, MO 63178), Fluka Chemie AG (Industriestrasse 25, CH-9471
Buchs, Switzerland (Fluka Chemical Corp. 980 South 2nd Street, Ronkonkoma, NY
11779)), Eastman Chemical Company, Fine Chemicals (P.O. Box 431, Kingsport, TN
37662), Boehringer Mannheim GmbH (Sandhofer Strasse 116, D-68298 Mannheim),
Takasago (4 Volvo Drive, Rockleigh, NJ 07647), SST Corporation (635 Brighton Road,
Clifton, NJ 07012), Ferro (111 West Irene Road, Zachary, LA 70791), Riedel-deHaen
Aktiengesellschaft (P.O. Box D-30918, Seelze, Germany), PPG Industries Inc., Fine
Chemicals (One PPG Place, 34th Floor, Pittsburgh, PA 15272); further, any kind of natural
products may be screened using the assay cascade of the invention, including microbial,
fungal or plant extracts.



3. Indications For The Use Of Compounds Modulating The
Activity Of Target Proteins Of The Invention According To Its
Predetermined Property

Depending on the nature and the function of the target protein, the
compounds identified by the above exemplified assays and methods may be modulators of
cell proliferation activity or modulators of certain metabolic functions. As such, the
compounds produced by the processes and assays of the invention are useful for the
treatment of disease related to aberrant, uncontrolled or inappropriate cell proliferation,
including excess or diminished proliferation. Alternatively, the compounds may be useful
for the treatment of diseases relating to aberrant metabolic functions.

A large number of disease states involve excess or diminished cell proliferation.
Generally, many of these diseases may be treated with DNA sequences, proteins, or small
molecules that influence cell proliferation. In some instances the goal is to stimulate
proliferation; in others, to prevent or inhibit proliferation of cells. The list of diseases
directly involving cell growth includes, but is not limited to, cancer, psoriasis,
inflammatory diseases, such as rheumatoid arthritis, restenosis, immunological activation or
suppression, including tissue rejection, neurodegeneration or expansion of neuronal cells
and viral infection.

Numerous diseases involve aberrant metabolic functions, including, but not limited
to, diabetes mellitus, fibrosis, and cystic fibrosis. Further, some pathological conditions
relate both to aberrant cell proliferation and to metabolic defects.

Accordingly, pharmaceutical compositions comprising a therapeutically effective
amount of a compound identified by the methods described above will be useful for the
treatment of diseases driven by unregulated or inappropriate cell proliferation, including
cancer, such as glioma, melanoma, Kaposi's sarcoma, psoriasis, hemangioma and ovarian,
breast, lung, pancreatic, prostate, colon and epidermoid cancer, rheumatoid arthritis,
restenosis, immunological activation or suppression, including tissue rejection,
neurodegeneration or expansion of neuronal cells; and diseases relating to metabolic
dysfunction, including, but not limited to, diabetes mellitus, fibrosis, cystic fibrosis.


I. Formulations/Route Of Administration 

The present invention provides two forms of therapeutic compounds. First,
the invention provides therapeutics which resemble a naturally occurring functional
polypeptide or peptide, or functional fragment or derivative thereof. Such therapeutics are,
for example, proteins which are secreted in their natural form; such proteins may be
identified directly with the expression systems of the invention. This type of therapeutics






WO 99/25876 



PCT/US98/24520 






includes, but is not limited to, cytokines, hormones, growth factors, etc. Second, the
invention provides therapeutic compounds which modulate the function of target proteins
identified using the methods and expression systems of the invention. Typically, these
types of compounds are small organic molecules, but they also include peptide compounds
and antibodies. The skilled artisan will appreciate that the following methods of
administration, formulation and treatment will need to be adjusted depending on
the type of compound chosen, and the type of disease to be treated.

The identified therapeutic compound can be administered to a human patient alone
or in pharmaceutical compositions where it is mixed with suitable carriers or
excipient(s) at therapeutically effective doses to treat or ameliorate a variety of disorders.
A therapeutically effective dose further refers to that amount of the compound sufficient to
result in amelioration of symptoms as determined, for example, in a decrease or increase of
cell proliferation, or in a restitution of metabolic functions. Techniques for formulation and
administration of the compounds of the instant application may be found in "Remington's
Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest edition.



1. Routes Of Administration.

Suitable routes of administration may, for example, include oral,
rectal, transmucosal, or intestinal administration; parenteral delivery, including
intramuscular, subcutaneous, intramedullary injections, as well as intrathecal, direct
intraventricular, intravenous, intraperitoneal, intranasal, or intraocular injections.

Alternately, one may administer a compound of the invention in a local rather than
systemic manner, for example, via injection of the compound directly into a solid tumor,
often in a depot, or in a sustained release formulation.

Furthermore, one may administer the drug via a targeted drug delivery system, for
example, in a liposome coated with tumor-specific antibody. The liposomes will be
targeted to and taken up selectively by the tumor.

2. Composition/Formulation

The pharmaceutical compositions of the present invention may be
manufactured by means of conventional mixing, dissolving, granulating, dragee-making,
levigating, emulsifying, encapsulating, entrapping or lyophilizing processes.

Pharmaceutical compositions for use in accordance with the present invention thus
may be formulated in a conventional manner using one or more physiologically acceptable
carriers comprising excipients and auxiliaries which facilitate processing of the active
therapeutic compounds into preparations which can be used pharmaceutically. Proper
formulation is dependent upon the route of administration chosen.

For injection, the agents of the invention may be formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hank's solution, Ringer's solution,
or physiological saline buffer. For transmucosal administration, penetrants appropriate to
the barrier to be permeated are used in the formulation. Such penetrants are generally
known in the art.

For oral administration, the therapeutic compounds can be formulated readily by
combining the active therapeutic compounds with pharmaceutically acceptable carriers well
known in the art. Such carriers enable the therapeutic compounds of the invention to be
formulated as tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions and
the like, for oral ingestion by a patient to be treated. Pharmaceutical preparations for oral
use can be obtained by combining the active compounds with a solid excipient, optionally
grinding a resulting mixture, and
processing the mixture of granules, after adding suitable auxiliaries, if desired, to obtain
tablets or dragee cores. Suitable excipients include fillers such as sugars, including lactose,
sucrose, mannitol, or sorbitol; cellulose preparations such as, for example, maize starch,
wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose,
hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose, and/or
polyvinylpyrrolidone (PVP). If desired, disintegrating agents may be added, such as
cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt thereof such as sodium
alginate.

Dragee cores are provided with suitable coatings. For this purpose, concentrated
sugar solutions may be used, which may optionally contain gum arabic, talc, polyvinyl
pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer solutions,
and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be added to
the tablets or dragee coatings for identification or to characterize different combinations of
active therapeutic compound doses.

Pharmaceutical preparations which can be used orally include push-fit capsules
made of gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as
glycerol or sorbitol. The push-fit capsules can contain the active ingredients in admixture
with fillers such as lactose, binders such as starches, and/or lubricants such as talc or
magnesium stearate and, optionally, stabilizers. In soft capsules, the active therapeutic
compounds may be dissolved or suspended in suitable liquids, such as fatty oils, liquid
paraffin, or liquid polyethylene glycols. In addition, stabilizers may be added. All
formulations for oral administration should be in dosages suitable for such administration.

For buccal administration, the compositions may take the form of tablets or lozenges
formulated in a conventional manner.

For administration by inhalation, the therapeutic compounds for use according to
the present invention are conveniently delivered in the form of an aerosol spray
presentation from pressurized packs or a nebulizer, with the use of a suitable propellant,
e.g., dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon
dioxide or other suitable gas. In the case of a pressurized aerosol the dosage unit may be
determined by providing a valve to deliver a metered amount. Capsules and cartridges of,
e.g., gelatin, for use in an inhaler or insufflator, may be formulated containing a powder
mix of the therapeutic compound and a suitable powder base such as lactose or starch.

The therapeutic compounds may be formulated for parenteral administration by
injection, e.g., by bolus injection or continuous infusion. Formulations for injection may be
presented in unit dosage form, e.g., in ampoules or in multi-dose containers, with an added
preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as
suspending, stabilizing and/or dispersing agents.

Pharmaceutical formulations for parenteral administration include aqueous solutions
of the active therapeutic compounds in water-soluble form. Additionally, suspensions of
the active compounds may be prepared as appropriate oily injection suspensions. Suitable
lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty acid
esters, such as ethyl oleate or triglycerides, or liposomes. Aqueous injection suspensions
may contain substances which increase the viscosity of the suspension, such as sodium
carboxymethyl cellulose, sorbitol, or dextran. Optionally, the suspension may also contain
suitable stabilizers or agents which increase the solubility of the therapeutic compounds to
allow for the preparation of highly concentrated solutions.

Alternatively, the active ingredient may be in powder form for constitution with a
suitable vehicle, such as sterile pyrogen-free water, before use.

The therapeutic compounds may also be formulated in rectal compositions such as 
suppositories or retention enemas, e.g., containing conventional suppository bases such as 
cocoa butter or other glycerides. 

In addition to the formulations described previously, the therapeutic compounds
may also be formulated as a depot preparation. Such long acting formulations may be
administered by implantation (for example subcutaneously or intramuscularly) or by
intramuscular injection. Thus, for example, the therapeutic compounds may be formulated
with suitable polymeric or hydrophobic materials (for example as an emulsion in an
acceptable oil) or ion exchange resins, or as sparingly soluble derivatives, for example, as a
sparingly soluble salt.

A pharmaceutical carrier for the hydrophobic therapeutic compounds of the 
invention is a cosolvent system comprising benzyl alcohol, a nonpolar surfactant, a water- 
miscible organic polymer, and an aqueous phase. 

The cosolvent system may be the VPD co-solvent system. VPD is a solution of 3%
w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent
system (VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This
co-solvent system dissolves hydrophobic therapeutic compounds well, and itself produces
low toxicity upon systemic administration. Naturally, the proportions of a co-solvent
system may be varied considerably without destroying its solubility and toxicity
characteristics. Furthermore, the identity of the co-solvent components may be varied: for
example, other low-toxicity nonpolar surfactants may be used instead of polysorbate 80; the
fraction size of polyethylene glycol may be varied; other biocompatible polymers may
replace polyethylene glycol, e.g., polyvinyl pyrrolidone; and other sugars or
polysaccharides may be substituted for dextrose.
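
The nominal composition of VPD:5W follows from the stated VPD recipe and the 1:1
dilution. The sketch below works this out; it assumes that "1:1" means equal volumes
and that volumes are additive on mixing, neither of which is stated in the text, and
the function and variable names are illustrative only.

```python
# Stated stock compositions in % w/v (grams per 100 ml of solution)
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
D5W_PERCENT_WV = {"dextrose": 5.0}


def mix(stock_a, stock_b, parts_a=1.0, parts_b=1.0):
    """Return % w/v of each solute after mixing two stocks by volume.

    Assumes ideal (additive) volumes, which is only an approximation for
    ethanol/water mixtures.
    """
    total_parts = parts_a + parts_b
    names = sorted(set(stock_a) | set(stock_b))
    return {
        name: (stock_a.get(name, 0.0) * parts_a + stock_b.get(name, 0.0) * parts_b)
        / total_parts
        for name in names
    }


if __name__ == "__main__":
    for solute, percent in mix(VPD_PERCENT_WV, D5W_PERCENT_WV).items():
        print(f"{solute}: {percent:.1f}% w/v")
```

Under these assumptions VPD:5W contains half the stock concentration of each
component; the same helper can be reused to explore the varied proportions mentioned
above.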
Alternatively, other delivery systems for hydrophobic pharmaceutical therapeutic
compounds may be employed. Liposomes and emulsions are well known examples of
delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents such as
dimethylsulfoxide also may be employed, although usually with a greater toxicity.
Additionally, the therapeutic compounds may be delivered using a sustained-release
system, such as semipermeable matrices of solid hydrophobic polymers containing the
therapeutic agent. Various sustained-release materials have been established and are well
known by those skilled in the art. Sustained-release capsules may, depending on their
chemical nature, release the therapeutic compounds for a few weeks up to over 100 days.
Depending on the chemical nature and the biological stability of the therapeutic
reagent, additional strategies for protein stabilization may be employed.

The pharmaceutical compositions also may comprise suitable solid or gel phase
carriers or excipients. Examples of such carriers or excipients include but are not limited to
calcium carbonate, calcium phosphate, various sugars, starches, cellulose derivatives,
gelatin, and polymers such as polyethylene glycols.

Many of the therapeutic compounds of the invention may be provided as salts with
pharmaceutically compatible counterions. Pharmaceutically compatible salts may be
formed with many acids, including but not limited to hydrochloric, sulfuric, acetic, lactic,
tartaric, malic, succinic, etc. Salts tend to be more soluble in aqueous or other protonic
solvents than are the corresponding free base forms.



3. Effective Dosage.

Pharmaceutical compositions suitable for use in the present invention
include compositions wherein the active ingredients are contained in an effective amount to
achieve their intended purpose. More specifically, a therapeutically effective amount means
an amount effective to prevent development of or to alleviate the existing symptoms of the
subject being treated. Determination of the effective amounts is well within the capability
of those skilled in the art, especially in light of the detailed disclosure provided herein.

For any therapeutic compound used in the method of the invention, the
therapeutically effective dose can be estimated initially from cell culture assays. For
example, a dose can be formulated in animal models to achieve a circulating concentration
range that includes the IC50 as determined in cell culture (i.e., the concentration of the test
therapeutic compound which achieves a half-maximal inhibition or activation of the cell
proliferation activity, or restitution of a metabolic function). Such information can be used
to more accurately determine useful doses in humans.
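
As an illustration of how an IC50 might be read off cell culture data, the following
sketch interpolates between the two tested concentrations that bracket half-maximal
response. It is one simple estimate under stated assumptions, not the procedure of the
specification: responses are assumed to be expressed as a fraction of the untreated
control, concentrations are assumed positive and ascending, and the function name and
data are hypothetical. A full dose-response curve fit would normally be preferred.

```python
import math


def estimate_ic50(concentrations, responses):
    """Estimate the IC50 by log-linear interpolation.

    concentrations: ascending, positive test concentrations (any consistent unit)
    responses: remaining activity as a fraction of control (1.0 = no inhibition)
    Returns None when the data never cross 50% of control.
    """
    points = list(zip(concentrations, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        # Look for the pair of neighbouring points that brackets 50% activity
        if r_lo >= 0.5 >= r_hi and r_lo != r_hi:
            fraction = (r_lo - 0.5) / (r_lo - r_hi)
            log_ic50 = math.log10(c_lo) + fraction * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    return None


if __name__ == "__main__":
    # Hypothetical data: 90%, 70% and 30% of control activity at 1, 10 and 100 units
    print(estimate_ic50([1.0, 10.0, 100.0], [0.9, 0.7, 0.3]))
```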

A therapeutically effective dose refers to that amount of the therapeutic compound
that results in amelioration of symptoms or a prolongation of survival in a patient. Toxicity
and therapeutic efficacy of such therapeutic compounds can be determined by standard
pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining
the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically
effective in 50% of the population). The dose ratio between toxic and therapeutic effects is
the therapeutic index and it can be expressed as the ratio between LD50 and ED50.
Therapeutic compounds which exhibit high therapeutic indices are preferred.
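
The therapeutic index defined above is a simple ratio, and the preference for high
indices amounts to a ranking. A minimal sketch follows; the compound names and dose
values are invented for illustration, and both doses are assumed to be given in the
same units.

```python
def therapeutic_index(ld50, ed50):
    """Therapeutic index as the ratio LD50 / ED50 (both in the same dose units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


def rank_by_therapeutic_index(candidates):
    """Order candidate names from highest to lowest therapeutic index.

    candidates: mapping of name -> (LD50, ED50)
    """
    return sorted(
        candidates,
        key=lambda name: therapeutic_index(*candidates[name]),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical compounds with doses in mg/kg
    compounds = {"compound A": (100.0, 50.0), "compound B": (300.0, 10.0)}
    for name in rank_by_therapeutic_index(compounds):
        print(name, therapeutic_index(*compounds[name]))
```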

The data obtained from these cell culture assays and animal studies can be used in
formulating a range of dosage for use in humans. The dosage of such therapeutic
compounds lies preferably within a range of circulating concentrations that include the
ED50 with little or no toxicity. The dosage may vary within this range depending upon the
dosage form employed and the route of administration utilized. The exact formulation,
route of administration and dosage can be chosen by the individual physician in view of the
patient's condition. (See, e.g., Fingl et al., 1975, in "The Pharmacological Basis of
Therapeutics", Ch. 1, p. 1).
Dosage amount and interval may be adjusted individually to provide plasma levels
of the active moiety which are sufficient to maintain the kinase modulating effects, or
minimal effective concentration (MEC). The MEC will vary for each therapeutic
compound but can be estimated from in vitro data; e.g., the concentration necessary to
achieve 50-90% inhibition of the kinase using the assays described herein. Dosages
necessary to achieve the MEC will depend on individual characteristics and route of
administration. However, HPLC assays or bioassays can be used to determine plasma
concentrations.

Dosage intervals can also be determined using the MEC value. Therapeutic compounds
should be administered using a regimen which maintains plasma levels above the MEC for
10-90% of the time, preferably between 30-90% and most preferably between 50-90%. In
cases of local administration or selective uptake, the effective local concentration of the
drug may not be related to plasma concentration.
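
Whether a regimen keeps plasma levels above the MEC for the stated fraction of the
time can be checked from sampled plasma concentrations. The sketch below assumes
concentrations vary linearly between sampling times, which is a simplification; the
function names and the example profile are hypothetical.

```python
def fraction_time_above_mec(times, concentrations, mec):
    """Fraction of the sampled period during which concentration is >= MEC.

    times: strictly increasing sampling times (any consistent unit)
    concentrations: plasma concentration at each sampling time
    Uses linear interpolation between samples to locate MEC crossings.
    """
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need at least two matching time/concentration samples")
    time_above = 0.0
    samples = list(zip(times, concentrations))
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        if c0 >= mec and c1 >= mec:
            time_above += dt
        elif c0 < mec and c1 < mec:
            continue
        else:
            # The segment crosses the MEC once; find the crossing time
            t_cross = t0 + (mec - c0) / (c1 - c0) * dt
            time_above += (t_cross - t0) if c0 >= mec else (t1 - t_cross)
    return time_above / (times[-1] - times[0])


def within_range(fraction, low=0.10, high=0.90):
    """True if the fraction lies in the given band (default: the 10-90% range)."""
    return low <= fraction <= high


if __name__ == "__main__":
    # Hypothetical profile over one dosing interval (hours, arbitrary units)
    frac = fraction_time_above_mec([0, 1, 2, 3, 4], [0.0, 4.0, 4.0, 0.0, 0.0], mec=2.0)
    print(frac, within_range(frac), within_range(frac, low=0.50))
```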

The amount of composition administered will, of course, be dependent on the
subject being treated, on the subject's weight, the severity of the affliction, the manner of
administration and the judgment of the prescribing physician.

4. Packaging.

The compositions may, if desired, be presented in a pack or dispenser
device which may contain one or more unit dosage forms containing the active ingredient.
The pack may for example comprise metal or plastic foil, such as a blister pack. The pack
or dispenser device may be accompanied by instructions for administration. Compositions
comprising a therapeutic compound of the invention formulated in a compatible
pharmaceutical carrier may also be prepared, placed in an appropriate container, and
labelled for treatment of an indicated condition. Suitable conditions indicated on the label
may include inhibition or activation of cell proliferation, restitution of a metabolic function,
treatment of a tumor, treatment of arthritis, and the like.






The following examples for the generation and use of the selection systems of the 
invention are given to enable those skilled in the art to more clearly understand and to 
practice the present invention. The present invention, however, is not limited in scope by 



the exemplified embodiments, which are intended as illustrations of single aspects of the
invention only, and methods which are functionally equivalent are within the scope of the
invention. Indeed, various modifications of the invention in addition to those described
herein will become apparent to those skilled in the art from the foregoing description and
accompanying drawings. Such modifications are intended to fall within the scope of the
appended claims.



VII. EXAMPLES

A. Materials And Methods

The following are experimental procedures and materials used for the
Examples set forth below.

Construction Of cDNA Libraries Into The SinRep5 Vector. Total mRNA
of any tissue or cell line is isolated by using the QuickPrep Micro mRNA purification kit
(Pharmacia, Pharmacia Biotech Europe GmbH, Duebendorf, Switzerland) according to
supplier's recommendation. The first strand synthesis is carried out in the presence of
5-methyl-cytosine to protect internal StuI restriction sites, using the Stratagene (Stratagene
AG, Basel, Switzerland) ZAP cDNA synthesis kit according to supplier's instructions.
After brief digestion with RNaseH, the second strand is synthesized using DNA polymerase
I to produce 3' overhangs which can be recessed by T4 DNA polymerase (3' exonuclease
activity). The double stranded DNA is finally ligated into the pSinRep5 vector (Invitrogen,
NV Leek, Netherlands; see FIGURE 5 and Bredenbeek et al., 1993, J. Virol. 67:6439-6446).
Subsequent digestion of the resulting DNA with StuI linearizes the plasmids for in vitro
transcription.

The construction of cDNA libraries in the TE vectors (see FIGURE 6 and Frolov et
al., 1996, PNAS 93:11371-11377; the complete sequence of pTE5'2J is given in SEQ ID
NO:1) is carried out essentially as described for pSinRep5. The only exception is that
during the second strand synthesis 5-methyl-adenosine is used to protect internal XbaI
restriction sites to clone the resulting DNA fragment into the pTE plasmid pTE SEAP via
XbaI/StuI or ApaI.

The plasmids used in this study are described in detail in Table I.
TABLE I

Description of the DNA construct:

pSinRep5 constructs:

pSinRep5:                  from Invitrogen Sindbis expression system
                           (European Headquarters: Invitrogen, NV Leek,
                           Netherlands).
                           Bredenbeek et al., Nov. 1993, J. Virology 67:6439-6446.
    SP6 promoter:          9933-9951
    Non-structural genes:  60-7598
    Subgenomic promoter:   7580-7603
    Transcriptional start: 7598
    Multiple cloning site: 7647-7689
    poly A tail:           7997-8033
    Amp resistance gene:   8227-9085
    ColE1 origin:          9232-9861

pSinRep5 EPO:              pSinRep5 digested with XbaI/SphI was ligated with a
                           synthetic EPO sequence (XbaI/SphI); for the sequence of
                           EPO see appendix: sequence 1.

pSinRep5 SEAP:             pSinRep5 digested with XbaI/StuI and ligated with the
                           NheI/ClaI fragment of pSEAP 2 Basic (Clontech
                           Laboratories, Inc., Palo Alto, USA).

pSinRep5 lacZ:             from Invitrogen Sindbis expression system.

pTE constructs:

pTE 5'2J:                  A general description of the related vectors pTE 3'2J is
                           given in:
                           Hahn et al., 1992, Proc. Natl. Acad. Sci. USA
                           89:2679-2683.
                           Frolov et al., 1996, Proc. Natl. Acad. Sci. USA
                           93:11371-11377.
                           The detailed sequence of TE 5'2J is given as SEQ ID NO: 1.

pTE 5'2J EPO:              pTE was digested with XbaI/ApaI and ligated with the
                           XbaI/ApaI EPO fragment derived from pSinRep5.

pTE 5'2J SEAP:             pTE was digested with ApaI, the ends were treated
                           with Klenow enzyme before digestion with XbaI. The
                           vector was ligated with the NheI/ClaI fragment from
                           pSEAP 2 Basic (Clontech Laboratories, Inc., Palo Alto,
                           USA).

pTE 5'2J CAT:              pTE 5'2J was digested with XbaI, the ends were filled
                           according to standard procedures with Klenow enzyme.
                           The CAT PCR fragment, as shown in appendix: sequence 3,
                           was ligated blunt into vector TE 5'2J.

Helper constructs:

987 BB neo:
DHEB:
                           Bredenbeek et al., 1993, J. Virology 67:6439-6446.
                           (SEQ ID NO: 2).



In Vitro Transcription Of Sindbis cDNA Libraries. The linearized vector DNA
(pSinRep5 or pTE, linearized by NotI digestion; helper DNA DH-EB (Bredenbeek et al.,
1993, J. Virol. 67:6439-6446) linearized by EcoRI digestion) was made RNase-free by
purification over QiaQuick PCR purification columns (QIAGEN AG, Basel, Switzerland)
and elution with DEPC-H2O. Subsequent in vitro transcription was carried out using an
SP6 in vitro transcription kit (Invitroscript CAP Kit, Invitrogen BV, NV Leek, The
Netherlands) according to the manufacturer's recommendation. The resulting 5'-capped
mRNA was analyzed on reducing and non-reducing agarose gels.

Generation Of Sindbis Virus Particles. Two (2) to five (5) µg of in vitro
transcribed mRNA was electroporated (for TE constructs) or co-electroporated (for
pSinRep5 and helper DH-EB) into BHK 21 cells (ATCC No. CCL10) according to
Invitrogen's manual (Invitroscript CAP Kit, Invitrogen BV, NV Leek, The Netherlands).
The 5'-capped mRNA of pSinRep5 encodes the viral non-structural proteins, which induce
the viral replication steps. The co-electroporated helper mRNA (DH-EB) delivers the viral
structural proteins. After incubation for eighteen (18) hours at 37°C, 5% CO2 in Turbodoma
HP-1 medium containing 1% FCS, the cell supernatant was harvested and the amount of
released infectious virus particles was determined by plaque assays.

Plaque Assay. Dilution series of the harvested virus particles, see, supra, were
carried out in 1 ml Turbodoma HP-1 (1:10^4, 1:5x10^4, 1:10^5, 1:5x10^5, 1:10^6, 1:5x10^6) on
90% confluent BHK 21 cells in 6-well plates. After two (2) hours of incubation at 37°C (in
the case of the temperature-sensitive mutants from the complementation groups C, D and E,
the plates were incubated at the permissive temperature of 30°C (Lindquist et al., 1986,
Virology 151:10-20; Arias et al., J. Mol. Biol. 168:87-102; Carleton and Brown, 1996, J.
Virol. 70:952-959)), the medium was replaced with 2 ml of 41°C warm 0.8% agarose (Carl
Roth GmbH, Karlsruhe, Germany) in Turbodoma HP-1 medium (Dr. F. Messi, Cell Culture
Technologies, Zurich, CH). Plaques of 1 to 4 mm diameter had formed after two (2)
days of incubation at the permissive temperature. The plaques were counted and the
corresponding numbers of plaque forming units ("pfu") per ml were calculated.
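
The titer calculation in the last step can be sketched as follows. This is a minimal illustration only, assuming that the full 1 ml of a given dilution is plated per well; the function name `pfu_per_ml` and the example plaque count are hypothetical and are not taken from the protocol.

```python
def pfu_per_ml(plaque_count: int, dilution_factor: float, inoculum_ml: float = 1.0) -> float:
    """Back-calculate the titer of the undiluted stock in pfu/ml.

    dilution_factor is the denominator of the dilution, e.g. 1e5 for 1:10^5.
    inoculum_ml is the volume of diluted virus plated (assumed here: 1 ml).
    """
    if plaque_count < 0 or dilution_factor <= 0 or inoculum_ml <= 0:
        raise ValueError("plaque count must be >= 0; dilution and volume must be > 0")
    return plaque_count * dilution_factor / inoculum_ml


# Hypothetical reading: 42 plaques counted in the well that received the 1:10^5 dilution
example_titer = pfu_per_ml(42, 1e5)
print(f"{example_titer:.1e} pfu/ml")
```

In practice one would average over the wells of the dilution series that give a countable number of plaques.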

Agarose Blot Assay (ABA). 90% confluent BHK 21 cell layers in 60 mm dishes
were incubated at the permissive temperature for two (2) hours with HP-1 medium
containing approximately 800 to 1000 plaque forming units. The supernatant was replaced
by overlaying the cells with a 1.5 mm layer of 41°C warm 0.8% agarose in 1x HP-1 and the
cells were incubated for two (2) days at the permissive temperature. Secreted proteins were
detected by 35S Met/Cys labeling of the infected cells. For this purpose, the medium was
replaced for 30 minutes with 2 ml of RPMI 1640 (Met/Cys deficient medium) before 20
µCi 35S Met/Cys (Hartmann Analytic GmbH, Braunschweig, Germany) was added.

Release of viral proteins was inhibited by adding a protease inhibitor cocktail
and/or crosslinked antibodies against the Sindbis Virus glycoproteins as described in the
examples. The temperature-sensitive Sindbis mutants were shifted to 40°C for two (2) hours
pre-labeling to inhibit virus particle release. The 20 µCi 35S Met/Cys were applied for 2
hours at 37°C, before the cells and the agarose layer were washed twice with 1 ml of RPMI
1640 (Met/Cys deficient medium). Afterwards, pre-wetted nitrocellulose membranes were
placed on top of the agarose for 4 to 20 hours. The membranes were removed, washed
with buffers containing 0.1% Triton X-100 (Sigma, St. Louis, USA) as described in the
examples, dried, and exposed to X-ray films (Hyperfilm βmax, Amersham, Sweden).
Black spots on the developed X-ray film indicated Sindbis Virus particles
containing the cDNA for a secreted protein.

Amplification Of Selected Virus Particles. Positive plaques were identified by
superposition of the X-ray film with the agarose overlay. The corresponding plaques were
picked with a 10 µl tip and eluted in 200 µl PBS overnight at 4°C. After pelleting the
agarose, 60% confluent BHK 21 cells in a 12-well plate were infected at the permissive
temperature with the eluted virus for two (2) hours in 0.5 ml Turbodoma HP-1 before the
medium was replaced with 1 ml Turbodoma HP-1 medium. After incubation for 20 hours at
the permissive temperature, the cell supernatant was harvested and a 1:20 dilution in 2 ml
Turbodoma HP-1 was incubated for two (2) hours in a T25 flask with 60% confluent BHK
21 cells, before the medium was replaced with 10 ml HP-1 medium. After 20 hours of
incubation, the resulting supernatant contained about 10^4 to 10^6 pfu per ml, which
corresponds to a total of about 10^5 to 10^7 pfu per T25 supernatant.
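
The conversion from titer to total yield in the last sentence is a simple multiplication by the harvest volume. The sketch below assumes the 10 ml of HP-1 medium added to the T25 flask is recovered in full; the function name `total_pfu` is illustrative.

```python
def total_pfu(titer_pfu_per_ml: float, volume_ml: float = 10.0) -> float:
    """Total plaque forming units in a supernatant of the given volume."""
    return titer_pfu_per_ml * volume_ml


# Titer range stated in the text: about 10^4 to 10^6 pfu per ml
low_total = total_pfu(1e4)    # lower bound for a 10 ml T25 supernatant
high_total = total_pfu(1e6)   # upper bound for a 10 ml T25 supernatant
print(f"{low_total:.0e} to {high_total:.0e} pfu per T25 supernatant")
```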

Sequencing Of The Selected Sindbis Virus Particles. The virus mRNA was
isolated using a viral RNA isolation kit (High Pure Viral RNA Kit, Boehringer Mannheim,
Mannheim, Germany) and RT-PCR was done with the Superscript one-step RT-PCR kit
(Gibco/BRL, Life Technologies AG, Basel, Switzerland) according to the
recommendation of the manufacturer. The 3' primer for the RT-PCR was either the oligo
GCGCGGGCCC(T)20 (specific for the poly A tail) or the oligo
GCGCGGGCCCCGCTGCGTGGCATTATGCACC specific for the 3' CIS sequence of the Sindbis
Virus; the 5' primer was the oligo GCGCGGGCCCCGCTGCGTGGCATTATGCACC. The RT-PCR
product was digested with the restriction enzymes Bsp120I and EcoRI, gel-purified and
ligated in pBluescript (digested with EcoRI and Bsp120I) and finally sequenced using the
oligos "-40 forward and reverse primer" of the multiple cloning site.
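
As a quick check on the primer design, the following sketch expands the poly(A)-specific primer and locates the Bsp120I recognition sequence (GGGCCC) in both 3' primers. It assumes that the "T20" notation means a run of twenty T residues appended to GCGCGGGCCC; the helper names are illustrative.

```python
BSP120I_SITE = "GGGCCC"  # recognition sequence of Bsp120I

# 3' primer specific for the 3' CIS sequence, as given in the text
CIS_PRIMER = "GCGCGGGCCCCGCTGCGTGGCATTATGCACC"


def oligo_dt_primer(prefix: str = "GCGCGGGCCC", t_count: int = 20) -> str:
    """Expand the poly(A)-specific primer (assumed: prefix followed by t_count T residues)."""
    return prefix + "T" * t_count


def find_sites(sequence: str, site: str) -> list[int]:
    """Return the 0-based start positions of every occurrence of site in sequence."""
    sequence = sequence.upper()
    return [i for i in range(len(sequence) - len(site) + 1)
            if sequence[i:i + len(site)] == site]


dt_primer = oligo_dt_primer()
print(dt_primer, find_sites(dt_primer, BSP120I_SITE))
print(CIS_PRIMER, find_sites(CIS_PRIMER, BSP120I_SITE))
```

Both primers carry the site near their 5' end, which is consistent with the subsequent Bsp120I digestion of the RT-PCR product.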

Immunofluorescence Microscopy of Sindbis Virus Infected Cells. 60% confluent
BHK 21 cells in 6-well plates were infected for 2 hours at the permissive temperature at an
moi of 0.1 in 1 ml HP-1 medium with Sindbis virus particles containing a heterologous
cDNA encoding a membrane protein (receptor). 20 hours post-infection, the cells were
dissociated with cell dissociation solution (Sigma-Aldrich, Steinheim, Germany) and were
washed twice at 4°C with 1% BSA in HBSS. The resuspended cells were then incubated
with ligand-flag fusion protein at concentrations between 100 ng/ml and 1 µg and with 5
µg/ml monoclonal antibody M2 for 15 min on ice. The cells were washed twice before
being incubated for 30 min with 10 µg/ml FITC-conjugated secondary antibody in 1% BSA
in HBSS at 4°C. After two (2) washing steps with 1% BSA in HBSS, the cells were
analyzed with an immunofluorescence microscope (Leica DMIL, Leica, Heerbrugg,
Switzerland) or the cells were analyzed and sorted with a FACS sorter.

Cloning Of ts2, ts20 And The Double Mutation ts2,20 In DH-EB. The mutations
ts2, ts20 and the combination of these two mutations were inserted into DH-EB
(Bredenbeek et al., 1993, J. Virology p. 6439-6446) by PCR using the following
oligonucleotides:

primers:

oligo Aat II      GCACTTAAGTTGGAGGCCGAC
oligo BssH II     GGCACTCACGGCGCGCTTTACAGGC
oligo Hpa I.1     GCAATGTtAAcaGGTCTGTATCTAATTGG
oligo Hpa I.2     CAGACCtgTTaACATTGCTCACCACCAGG
oligo Pvu I.1     GACAGCGGTCGatCGATCATGGATAACTC
oligo Pvu I.2     GAGTTATCCATGATCGatCGACCGCTGTC



The restriction sites used for analysis are underlined, mutated nucleotides are
indicated in small letters and bold. PCR reactions were performed using the following
combinations:

PCR1: oligo Aat II - oligo Hpa I.1 (for ts20)
PCR2: oligo BssH II - oligo Hpa I.2 (for ts20)
PCR3: oligo Aat II - oligo Pvu I.1 (for ts2)
PCR4: oligo BssH II - oligo Pvu I.2 (for ts2)
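
The relationship between the paired mutagenic primers can be checked with a short script. This is a sketch, not part of the original method: it tests whether the two primers of a pair are reverse complements of each other and whether each carries the diagnostic restriction site (HpaI, GTTAAC; PvuI, CGATCG). The dictionary keys are illustrative labels for the oligonucleotides listed above.

```python
# Complement table that preserves the lower-case marking of mutated nucleotides
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

PRIMERS = {
    "Hpa I.1": "GCAATGTtAAcaGGTCTGTATCTAATTGG",
    "Hpa I.2": "CAGACCtgTTaACATTGCTCACCACCAGG",
    "Pvu I.1": "GACAGCGGTCGatCGATCATGGATAACTC",
    "Pvu I.2": "GAGTTATCCATGATCGatCGACCGCTGTC",
}

SITES = {"Hpa I": "GTTAAC", "Pvu I": "CGATCG"}


def reverse_complement(sequence: str) -> str:
    """Reverse complement of a DNA sequence, keeping upper/lower case."""
    return sequence.translate(COMPLEMENT)[::-1]


def has_site(sequence: str, site: str) -> bool:
    """True if the recognition sequence occurs in the primer (case-insensitive)."""
    return site in sequence.upper()


for name, sequence in PRIMERS.items():
    enzyme = name.split(".")[0]          # "Hpa I" or "Pvu I"
    print(name, "contains", enzyme, "site:", has_site(sequence, SITES[enzyme]))

print("Pvu pair fully complementary:",
      reverse_complement(PRIMERS["Pvu I.1"]) == PRIMERS["Pvu I.2"])
```

The Pvu pair is an exact reverse-complement pair, whereas the Hpa pair overlaps only partially around the mutated site; either arrangement lets the two half-fragments be joined at the introduced restriction site.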



100 pmol of each oligo and 5 ng of the template DNA were used in the 100 µl
reaction mixture, containing 4 units of Taq or Pwo polymerase, 0.1 nM dNTPs and 1.5 mM
MgSO4. The polymerase was added directly before starting the PCR reaction (starting
point 95°C). The temperature cycles were as follows: 95°C for 1.5 min, followed by 5
cycles of 95°C (30 seconds), 54°C (30 seconds), 72°C (120 seconds) and followed by 25
cycles of 95°C (30 seconds), 64°C (30 seconds), 72°C (120 seconds).
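
The two-stage cycling program can be written down as data, which makes the total hold time easy to compute. The sketch below uses only the temperatures and durations stated above; ramp times between steps depend on the thermocycler and are deliberately ignored, and the variable names are illustrative.

```python
# Each step is (temperature in °C, hold time in seconds)
INITIAL_DENATURATION = (95, 90)                      # 95°C for 1.5 min
FIRST_STAGE = [(95, 30), (54, 30), (72, 120)]        # repeated 5 times
SECOND_STAGE = [(95, 30), (64, 30), (72, 120)]       # repeated 25 times
FIRST_STAGE_CYCLES = 5
SECOND_STAGE_CYCLES = 25


def stage_seconds(steps: list[tuple[int, int]], cycles: int) -> int:
    """Total hold time of one stage: the sum of its step durations times the cycle count."""
    return cycles * sum(duration for _, duration in steps)


def program_seconds() -> int:
    """Total hold time of the whole program, excluding temperature ramps."""
    return (INITIAL_DENATURATION[1]
            + stage_seconds(FIRST_STAGE, FIRST_STAGE_CYCLES)
            + stage_seconds(SECOND_STAGE, SECOND_STAGE_CYCLES))


total = program_seconds()
print(f"{total} s = {total / 60:.1f} min of programmed hold time")
```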

The four PCR fragments (PCR 1 to 4) were purified using the Qia spin PCR kit
(QIAGEN, Inc., Chatsworth, CA 91311, USA) and finally digested in the appropriate
buffer using 10 units of DraIII and HpaI, HpaI and BssHII, DraIII and PvuI, or BssHII and
PvuI, respectively. The digestion was performed for 6 hours at 37°C, before the DNA
fragments were gel-purified (gene-clean; Bio 101 Inc., Vista, CA 92083, USA). The
purified PCR fragments 1 and 2 and the fragments 3 and 4 were finally ligated into DraIII /
BssHII digested and gel-purified pDH-EB (Bredenbeek et al., J. Virology Nov. 1993, p.
6439-6446). The correct sequence of the obtained vectors was checked by restriction
enzyme digestion and was verified by DNA sequencing using the following primers:

oligo seq ts2: TTCCAAGCCATCAGAGGGG 
oligo seq ts20: AGATTAGCACCTCAGGACCG 



The correct vectors DH-EB ts2 and DH-EB ts20 were digested for 4 hours at 37°C
with 10 units of Bsp68I and EcoRI in the appropriate buffer. The 4.5 kb DNA fragment of
pDH-EB ts2 and the 3.4 kb fragment of pDH-EB ts20 were gel-purified (as described
above) and ligated to create DH-EB ts2,20. The correct sequence of the obtained vector
was checked by restriction enzyme digestion and was verified by DNA sequencing using
the primers oligo seq ts2 and oligo seq ts20 (described above).

Production Of Milligram Quantities Of Secreted Proteins. A batch of BHK 21 
cells is infected with the virus supernatant containing the protein of interest. Subsequent 
purification will yield several milligrams of the pure protein. 



B. Example 1: Compartment Screening and Identification of Nucleic Acids
Encoding Proteins with a Predetermined Localization

Confluent cultures of BHK cells were infected with viral particles containing
two expression systems with two different nucleic acids (pSinRep5 EPO and pSinRep5
lacZ), both packaged with the helper construct DHBB. One culture was not infected as a
negative control (Vial 1, lane 1 in FIGURE 1A). Twenty-four hours postinfection, the cells
were starved in starvation medium lacking methionine and cysteine for 30 minutes. A
pulse of 10 µCi 35S Met/Cys was added and replaced by fresh medium after 30 minutes.
After an incubation of two hours, the cells and the supernatant were collected. The cells
were resuspended in Laemmli buffer and boiled for 10 minutes. Five hundred µl
supernatant was added to 600 µl methanol and 200 µl chloroform. The upper phase was
discarded and precipitated protein was pelleted by adding 800 µl of methanol. Both the
precipitated supernatant proteins and cell pellets were resuspended in 25 µl Laemmli
buffer, and the proteins separated on a 15% SDS polyacrylamide gel. After drying, the gel
was exposed overnight to an X-ray film. The labelled proteins found in the supernatant
were run on the gel presented in FIGURE 1A; FIGURE 1B depicts labelled proteins found
in the cell pellet corresponding to the supernatants in lanes 1 and 2 of FIGURE 1A.

Identification of a nucleic acid encoding a secreted protein was possible by
detection of a band on the X-ray film (FIGURE 1A), which is the result of a unique
labelled protein in the supernatant encoded by the exogenous expression system. FIGURE
1A, lane 1 shows that uninfected cells (containing no exogenous expression system) secrete
several unidentified proteins. The corresponding cell pellet is shown in FIGURE 1B, where
many proteins were labelled and a multitude of bands is visible. The blank lane 2 in
FIGURE 1A indicates the presence of an expression system containing a nucleic acid
encoding an intracellular protein in vial 2. Accordingly, lane 2 in FIGURE 1B confirms
this finding by the presence of a unique band which stems from a single labelled protein
from the cell pellet. Expression of all other endogenous proteins was suppressed (compare
with lane 1), allowing unique labelling and a clear identification of the nucleic acid of vial 2
as a nucleic acid coding for an intracellular protein (which confirms the published finding
that lacZ is in fact an intracellular protein). A strong band resulting from a single
radioactive protein in the supernatant in FIGURE 1A, lane 6 indicates the presence of an
expression system containing a nucleic acid encoding a secreted protein. Vial 6 (lane 6 in
FIGURES 1A and 1B) was infected with the pSinRep5 EPO construct. This finding
confirms that EPO is a secreted protein. Endogenous protein expression was again
suppressed, allowing unique labelling and a clear identification of the nucleic acid of vial 6
as a nucleic acid coding for a secreted protein. Lanes 3-5 show labelled secreted proteins
from mixtures of the two expression systems. Increasing proportions of the EPO
expression system with 10%, 50% and 90% give rise to a more uniquely labelled protein in
the supernatant. These results demonstrate that compartment screening can identify nucleic
acids which encode proteins of a predetermined cellular localization.

C. Example 2: Separation of Labelled Viral Particles from Secreted
Protein for Semisolid Medium Screening.

For optimal identification of expression systems encoding secreted proteins
in semisolid medium, it is preferred that a) endogenous protein synthesis is suppressed; and
b) labelled viral particles and other labelled soluble viral protein are prevented from
interfering with the screen. Confluent layers of BHK 21 cells in 35 mm dishes were
infected with pTE5'2JCAT or pTE5'2JEPO double subgenomic infective viral particles at
an MOI of 5. As a negative control, uninfected BHK 21 cells were used. Fourteen hours
post infection, the medium was removed and the cells were overlaid with 4 mm of 0.8%
agarose in RPMI 1640 medium containing 10 µCi 35S Met/Cys per dish. After gelling, the
cultures were overlaid with 1 ml of Cys/Met free RPMI, and the medium was collected
after 2 hours, 4 hours and 8 hours. The protein was precipitated and separated on a 15%
SDS gel. The gel was exposed to an X-ray film overnight. It could be demonstrated that
endogenous protein synthesis was inhibited again and that labelling was specific (FIGURE
2: compare lanes 1-3 with lanes 4-9). Moreover, diffusion of viral particles was also
inhibited since the characteristic pattern of the three structural viral proteins (capsid, E1 and
E2) could not be detected (lanes 4-9).

In both supernatants of the virally infected cultures, a labelled protein with a size of 
about 60 kD was present (lanes 4-9). It was speculated that this protein is a viral protein 
and that it might result from proteolytic cleavage of one of the glycoproteins of Sindbis 
virus. In the case of pTE5'2JEPO after 8 hours, an additional protein with the size of EPO
was detected (lane 9), demonstrating that semi-solid medium screening is feasible in
principle. This experiment demonstrates that labelled viral particles can be separated from
secreted protein by limited diffusion in 0.8% agarose. However, without further measures, 
the release of soluble viral protein occurs. Such a release is undesirable because it may 
interfere with the phenotypic screen. Example 3 describes a method that inhibits the release 
of unwanted viral proteins. 

D. Example 3: Inhibition of Release of Viral Soluble Proteins 

Confluent layers of BHK cells were infected with double subgenomic
pTE5'2JCAT viral particles at an MOI of 5. Fourteen hours post infection, the medium was
removed and the cells were overlaid with 4 mm of 0.8% agarose in RPMI 1640 medium
containing 10 µCi 35S Met/Cys per dish and varying amounts of Mini Protean protease
inhibitor cocktail solution (seven times concentrated as described in the manual, Boehringer
Mannheim, Rotkreuz, Switzerland). One hundred µl, 20 µl, 10 µl, 5 µl and 1 µl were used
(see lanes 1-5 in FIGURE 3). After gelling, the cultures were overlaid with 1 ml of
Cys/Met free RPMI, and the medium was collected after 4 hours. The protein was
precipitated and separated on a 15% SDS gel. The gel was exposed to an X-ray film
overnight. The results (depicted in FIGURE 3) demonstrated that release of viral protein
was inhibited by addition of the protease inhibitor. At concentrations above 20 µl/ml (lanes
1 and 2), the release of soluble protein was inhibited, showing that neither viral particles nor
viral protein diffuses through the agarose layer, and that no endogenous or viral protein
interfered with the screen.



E. Example 4: Identification of a Nucleic Acid Encoding a Protein with a
Predetermined Enzymatic Activity in a Mixture of Nucleic Acids in
Semi-Solid Medium Screening

An expression system containing a nucleic acid encoding a protein with
phosphatase activity was identified in a mixture of two expression systems. pSinRep5 lacZ
and pSinRep5 SEAP were packaged individually with the helper DHEB, yielding infective
particles. The two expression systems were mixed in a ratio of 10:1 (pSinRep5 lacZ:
pSinRep5 SEAP). Confluent layers of BHK 21 cells in 35 mm dishes were infected with
this mixture of expression systems and alternatively only with the pSinRep5 lacZ
expression system. Two hours postinfection, the medium was removed and the cells were
overlaid with 1.5 ml 0.8% agarose in 1x HP-1 medium.

Two days later, a nitrocellulose filter was placed on top of the agarose for 4 hours.
The filter was removed, washed in PBS and placed in a solution of 100 mM Tris-HCl, 100
mM NaCl, 370 mg/l nitroblue tetrazolium and 250 mg/l
5-bromo-4-chloro-3-indolylphosphate (all Sigma) to detect alkaline phosphatase activity.

In the sample containing 10% of the pSinRep5 SEAP expression system, two
spatially distinct areas with alkaline phosphatase activity could be detected (FIGURE 5).
The two corresponding plaques that gave rise to the enzymatic conversion of the substrate
and a negative control from an area that stained negative on the screen were isolated. The
cell culture was then fixed in 0.5% glutaraldehyde in PBS and lacZ activity was detected in
the cells by X-Gal staining as described in the instruction manual "Sindbis expression
system", Invitrogen, San Diego, USA (not shown). The approximately 20 to 30 plaques that
stained blue in this cell culture reflect the original distribution of expression systems.

Viruses from the positive alkaline phosphatase plaques and the negative control
were eluted in PBS overnight by shaking at 4°C. The PBS was added to a fresh culture of
confluent BHK cells in a 35 mm dish and SEAP activity was confirmed 2 days post
infection by the following assay. Five hundred µl 2x SEAP buffer (2 mM L-homoarginine,
0.2 M diethanolamine, 0.1 mM MgCl2 in H2O) plus 100 µl substrate solution (120 mM
nitrophenylphosphate in H2O) were mixed with 400 µl of heat inactivated (10 minutes at
60°C) cell culture supernatant. SEAP activity was revealed by a color change from purple
to yellow in the sample from the positive areas of the culture, whereas the negative controls
did not give rise to a color change. This example demonstrates that an expression system
containing a nucleic acid encoding a protein with a predetermined enzymatic activity can be
isolated from a mixture of expression systems in semisolid medium screening.
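
The final concentrations in the SEAP assay mixture follow from the three volumes given above. The sketch below is illustrative and rests on one assumption that the text does not state explicitly: the concentrations listed for the 2x SEAP buffer are taken to be those of the 2x stock, so that each is halved in the 1 ml reaction.

```python
BUFFER_UL = 500        # 2x SEAP buffer
SUBSTRATE_UL = 100     # 120 mM nitrophenylphosphate
SUPERNATANT_UL = 400   # heat-inactivated cell culture supernatant
TOTAL_UL = BUFFER_UL + SUBSTRATE_UL + SUPERNATANT_UL


def final_concentration(stock_concentration: float, stock_volume_ul: float,
                        total_volume_ul: float = TOTAL_UL) -> float:
    """Concentration after dilution into the reaction (same unit as the stock)."""
    return stock_concentration * stock_volume_ul / total_volume_ul


substrate_mM = final_concentration(120, SUBSTRATE_UL)
# Assumption: the values below are the 2x stock concentrations
homoarginine_mM = final_concentration(2, BUFFER_UL)
diethanolamine_M = final_concentration(0.2, BUFFER_UL)

print(f"total volume: {TOTAL_UL} µl")
print(f"substrate: {substrate_mM} mM, L-homoarginine: {homoarginine_mM} mM, "
      f"diethanolamine: {diethanolamine_M} M")
```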

F. Example 5: Stable Amplification of Expression Systems 

The positive viral particles (containing the nucleic acid coding for SEAP)
from Example 4 were amplified over three passages. One percent of the supernatant from
the 35 mm dish of Example 4 was used to infect an 80% confluent T25 (25 cm^2) cell
culture flask containing BHK 21 cells. Twenty-four hours post infection, the supernatant was
removed and assayed for SEAP activity. SEAP activity in this supernatant was demonstrated
by a fast color change in this sample. The particles were passaged in this way for a total of four
times, thereby amplifying the expression system by a factor of 10^8 within 5 days. This
result demonstrates that the expression system can be amplified rapidly and efficiently to
yield large amounts of a protein of interest. Moreover, this process yields substantial
amounts of the expression system of interest for further investigation of the protein's function
and usefulness in any other mammalian system.
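
The stated overall amplification can be related to the per-passage inoculum. The sketch below assumes four passages, as in the sentence that reports the factor of 10^8, and assumes equal amplification in every passage; both are simplifications for illustration.

```python
def fold_per_passage(total_fold: float, passages: int) -> float:
    """Geometric mean amplification per passage for a given overall factor."""
    if passages <= 0:
        raise ValueError("passages must be positive")
    return total_fold ** (1.0 / passages)


per_passage = fold_per_passage(1e8, 4)
print(f"about {per_passage:.0f}-fold per passage")
```

A 100-fold gain per passage matches the use of one percent of the preceding supernatant as inoculum: each passage restores the titer after a 100-fold dilution.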
G. Example 6: Identification Of A Nucleic Acid Encoding A Secreted
Protein In A Mixture Of Nucleic Acids In Semi-Solid Medium
Screening

An expression system containing a nucleic acid encoding a secreted protein
with phosphatase activity was identified in a mixture of two expression systems.
pSinRep5 LacZ and pSinRep5 SEAP were packaged individually with the helper DHEB,
yielding infective particles. 90% confluent BHK 21 cell layers in 60 mm dishes (Easy Grip,
Falcon 3004, Becton Dickinson and Company, England) were incubated at 37°C for 2
hours with 1 ml of 1x HP-1 medium containing approximately 780 plaque forming units
("pfu") of pSinRep5 LacZ and 20 pfu of pSinRep5 SEAP (secreted alkaline phosphatase).
After removal of the medium, the cells were overlaid with 3 ml of 41°C warm 0.8% agarose
(Carl Roth GmbH, Karlsruhe, Germany) in HP-1 medium and were then incubated for 2
days at 37°C in 5% CO2.
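
The composition of the inoculum determines how many positive plaques to expect per dish. The following sketch computes the SEAP fraction of the mixture and, under the assumption that plaque numbers are Poisson distributed around the nominal 20 pfu, the probability that a dish contains at least one SEAP plaque. The function names are illustrative.

```python
import math

LACZ_PFU = 780   # pSinRep5 LacZ particles per dish
SEAP_PFU = 20    # pSinRep5 SEAP particles per dish


def positive_fraction(positive_pfu: float, negative_pfu: float) -> float:
    """Fraction of plaques expected to carry the positive construct."""
    return positive_pfu / (positive_pfu + negative_pfu)


def prob_at_least_one(expected_plaques: float) -> float:
    """Poisson probability of observing one or more plaques (assumed model)."""
    return 1.0 - math.exp(-expected_plaques)


fraction = positive_fraction(SEAP_PFU, LACZ_PFU)
print(f"SEAP plaques: {fraction:.1%} of about {LACZ_PFU + SEAP_PFU} plaques per dish")
print(f"P(at least one SEAP plaque) = {prob_at_least_one(SEAP_PFU):.6f}")
```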

The agarose was overlaid with 1 ml 1x RPMI 1640 Met-/Cys- for 10 min before
being replaced by 1 ml of fresh 1x RPMI 1640 Met-/Cys-. After 30 min starvation, the
medium was replaced with 0.8 ml 1x RPMI 1640 Met-/Cys- plus 20 µCi of 35S Met/Cys.
After labeling for 4 hours, the agarose overlay was washed 3 times for 10 min with 1 ml 1x
HP-1, before the 54 mm pre-wetted nitrocellulose filter (0.45 µm membrane, BA85,
Schleicher & Schuell, Germany) was applied for 16 to 18 hours.

Diffusion blotting proceeded for 18 hours, before SEAP activity was
detected by AP staining (10 ml of 100 mM Tris-HCl, 100 mM NaCl, 370 mg/l
nitroblue tetrazolium and 250 mg/l 5-bromo-4-chloro-3-indolylphosphate (all Sigma)),
resulting in violet spots (FIGURE 7). The AP development was stopped by removal of the
AP staining solution and with subsequent washing steps (3 times with TBS containing 0.1%
Triton X-100). The nitrocellulose was then dried and exposed for at least 24 hours to an
X-ray film (Hyperfilm βmax, Amersham, Sweden) before being developed with an AGFA
Curix 60 (Schenk, Winterthur, Switzerland) machine (FIGURE 7).

The coordinates of the radioactive spots were determined and the equivalent regions
were picked from the agarose layer with a 10 µl tip. The eluted virus (as described above)
was passaged in a 60% confluent BHK 21 12-well plate. Presence of a gene encoding
SEAP in the insert of the eluted virus was determined by spotting 4 µl supernatant on a
nitrocellulose membrane and by AP staining as described above.

The correspondence of the spots on the Secreted Alkaline Phosphatase stained filters
with the spots on the X-ray film clearly demonstrated that nucleic acids encoding a secreted
protein can be identified in a mixture of nucleic acids using semi-solid medium screening.
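
The comparison of spot positions on the X-ray film with those on the AP-stained filter can be illustrated with a small matching routine. This is a hypothetical sketch: the text does not describe how coordinates were recorded, so the coordinate units (mm in a common reference frame), the tolerance and the example values are all assumptions.

```python
from math import hypot

Point = tuple[float, float]


def match_spots(film_spots: list[Point], ap_spots: list[Point],
                tolerance_mm: float = 2.0) -> list[tuple[Point, Point]]:
    """Pair each film spot with its nearest AP-stained spot if within tolerance."""
    matches = []
    for film in film_spots:
        if not ap_spots:
            break
        nearest = min(ap_spots, key=lambda ap: hypot(ap[0] - film[0], ap[1] - film[1]))
        if hypot(nearest[0] - film[0], nearest[1] - film[1]) <= tolerance_mm:
            matches.append((film, nearest))
    return matches


# Hypothetical coordinates (mm), both sets measured from the same filter edge marks
film = [(10.0, 12.0), (30.0, 5.0)]
ap_stain = [(10.5, 12.2), (40.0, 40.0)]
print(match_spots(film, ap_stain))
```

Film spots without a partner on the AP-stained filter would, in a library screen, correspond to secreted proteins other than SEAP.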
H. Example 7: Identification Of Nucleic Acids Encoding Secreted Proteins
In A Sindbis Virus Library Representing A cDNA Library In
Semi-Solid Medium Screening

A directed cDNA library from ECV 304 cells (ATCC CRL-1998, human
endothelial cell line) containing a Not I site at the 3' end and a 5' blunt end was ligated into
Stu I, Bsp120I digested pSinRep5 vector and 10^7 primary clones were obtained after
transformation of E. coli DH10B. After preparation of plasmid DNA and linearization with
Not I, the library was transcribed in vitro as described before and packaged with the helper
DHEB, yielding infective particles. 90% confluent BHK 21 cell layers in 60 mm dishes
(Easy Grip, Falcon 3004, Becton Dickinson and Company, England) were incubated at
37°C for 2 hours with 1 ml of 1x HP-1 medium containing approximately 800 pfu of
pSinRep5 ECV 304 library. After removal of the medium, the cells were overlaid with 3
ml of 41°C warm 0.8% agarose (Carl Roth GmbH, Karlsruhe, Germany) in HP-1 medium and
were then incubated for 2 days at 37°C in 5% CO2.

The agarose was overlaid with 1 ml 1x RPMI 1640 Met-/Cys- for 10 min, which was
replaced by 1 ml of fresh 1x RPMI 1640 Met-/Cys-. After 30 min starvation, the medium
was replaced with 0.8 ml 1x RPMI 1640 Met-/Cys- containing 20 µCi of 35S Met/Cys.
After 4 hours of labeling, the agarose overlay was washed 3 times for 10 min with 1 ml 1x
HP-1, before the 54 mm pre-wetted nitrocellulose filter (0.45 µm membrane, BA85,
Schleicher & Schuell, Germany) was applied.

Diffusion blotting proceeded for 18 hours at 37°C, then the nitrocellulose
was washed three (3) times for 10 minutes with TBS containing 0.1% Triton X-100 before the
membrane was dried and exposed for at least 24 hours to an X-ray film (Hyperfilm βmax,
Amersham, Sweden), which was subsequently developed with an AGFA Curix 60 (Schenk,
Winterthur, Switzerland) machine.

The coordinates of the radioactive spots were determined and the equivalent regions
were picked from the agarose layer with a 10 µl tip. The virus was eluted as described
before and was passaged on 60% confluent BHK 21 cells in 12-well plates. After
reamplification of the virus particles on 60% confluent BHK 21 cells in 6-well plates, the
viral RNA was isolated using the "high pure viral RNA kit" (Boehringer Mannheim,
Mannheim, Germany). RT-PCR was carried out with the Superscript one-step RT-PCR kit
according to the manufacturer (Gibco/BRL, Life Technologies AG, Basel, Switzerland) and
the primers described before. The RT-PCR product was digested with the restriction
enzymes Bsp120I and EcoRI, gel-purified and ligated in pBluescript (digested with EcoRI
and Bsp120I) and finally sequenced using the oligos "-40 forward and reverse primer" of
the multiple cloning site.



I. Example 8: Identification Of A Nucleic Acid Encoding A Secreted 
Protein In A Mixture Of Nucleic Acids In Semi-Solid Medium 
Screening Using Sindbis TS Mutants 
Recombinant SEAP Sindbis Virus Stocks. EcoRI-linearized DH-EB helper
constructs (DH-EB, DH-EB ts2, DH-EB ts20, DH-EB ts2,20) and NotI-linearized pSinRep5
SEAP and pSinRep5 LacZ DNA were made RNase-free by purification over QiaQuick
PCR purification columns (QIAGEN AG, Basel, Switzerland) and elution with DEPC-H2O.
SP6 in vitro transcription was carried out according to the manufacturer Invitrogen
(Invitroscript CAP Kit, Invitrogen BV, NV Leek, The Netherlands). 5 µg of SinRep5
SEAP and 5 µg of the helper transcript were co-electroporated into BHK 21 cells according
to Invitrogen (Invitroscript CAP Kit, Invitrogen BV, NV Leek, The Netherlands). The
supernatants were harvested 30 hours post-electroporation (incubated at 30°C, 5% CO2) and
were assayed for SEAP activity. 4 µl of supernatant was spotted on nitrocellulose
membrane (Schleicher & Schuell) and SEAP activity was detected by the AP staining method
as described above (FIGURE 8).

First Passage Of ts Mutants. The recombinant virus stocks were passaged in a
60% confluent 6-well plate with BHK 21 cells. 100 µl of each stock plus 900 µl of HP-1
were added for two (2) hours to the BHK cells at 30°C, then the virus supernatant was
replaced with 2 ml HP-1. The cells were incubated for 24 hours either at the permissive
temperature of 30°C or at the nonpermissive temperature of 37°C before the supernatant was
assayed for SEAP activity (as described before) (FIGURE 8).

Plaque Assay For The ts Mutants. Dilution series of the harvested virus particles,
see, supra, were carried out in 1 ml Turbodoma HP-1 (1:10^4, 1:5x10^4, 1:10^5, 1:5x10^5,
1:10^6, 1:5x10^6) on 90% confluent BHK 21 cells in 60 mm tissue culture dishes (Easy Grip,
Falcon 3004, Becton Dickinson and Company, England). The cells were infected at 30°C
for 2 hours with the diluted virus. The supernatant was then replaced with 0.8% agarose
in HP-1 and incubated for 2 days at 37°C or 30°C in 5% CO2. The plaques were then counted.

Agarose Blot Assay With A Mixture Of pSinRep SEAP And pSinRep LacZ. An
expression system containing a nucleic acid encoding a secreted protein was identified in a
mixture of two expression systems. pSinRep5 LacZ and pSinRep5 SEAP were packaged
individually with the ts mutant helpers DHEB (see TABLE I), yielding infective particles.
90% confluent BHK 21 cell layers in 60 mm dishes (Easy Grip, Falcon 3004, Becton
Dickinson and Company, England) were incubated at 30°C for 2 hours with 1 ml of 1x
HP-1 medium containing approximately 780 plaque forming units ("pfu") of pSinRep5
LacZ / DHEB ts and 20 pfu of pSinRep5 SEAP (secreted alkaline phosphatase) / DHEB ts.
After removal of the medium, the cells were overlaid with 3 ml of 41°C 0.8% agarose (Carl
Roth GmbH, Karlsruhe, Germany) in HP-1 medium and were then incubated for 2 days at
30°C in 5% CO2.

Two (2) hours pre-starvation, the cells were shifted up to 40°C in 5% CO2. The
agarose was overlaid with 1 ml 1x RPMI 1640 Met-/Cys- for 10 min before the 1 ml RPMI
medium was replaced with 1 ml of fresh 1x RPMI 1640 Met-/Cys-. After 30 min starvation
at 40°C, the medium was replaced with 0.8 ml 1x RPMI 1640 Met-/Cys- containing 20 µCi
of 35S Met/Cys. After 4 hours of labeling at 40°C, the agarose overlay was washed 3 times
for 10 min with 1 ml 1x HP-1, before a 54 mm pre-wetted nitrocellulose filter (0.45 µm
membrane, BA85, Schleicher & Schuell, Germany) was applied.

Diffusion blotting proceeded at 40°C for 18 hours, before SEAP activity
was detected by AP staining (10 ml of 100 mM Tris-HCl, 100 mM NaCl, 370 mg/l
nitroblue tetrazolium and 250 mg/l 5-bromo-4-chloro-3-indolylphosphate (all Sigma)),
resulting in violet spots (FIGURE 8). The AP staining was stopped by removing the AP
solution and by subsequent washing steps (3 times with TBS containing 0.1% Triton X-100). The
nitrocellulose was dried and exposed for at least 24 hours to an X-ray film (Hyperfilm
βmax, Amersham, Sweden) before being developed with an AGFA Curix 60 (Schenk,
Winterthur, Switzerland) machine.

The coordinates of the radioactive spots were determined and the equivalent regions
were picked from the agarose layer with a 10 µl pipette tip.



J. Example 9: Identification Of A Nucleic Acid Encoding A Secreted
Protein In A Mixture Of Nucleic Acids In Semi-Solid Medium Library
Screening Using The TS Mutants

pSinRepS' cDNA library ECV 304 was packaged with the mutant helpers 
DHEB, yielding infective particles. 90% confluent BHK 21 cell layers in 60 mm dishes 
(Easy Grip, Falcon 3004, Becton Dickinson and Company, England) were incubated at 
30°C for 2 hours with 1 ml of lx HP-1 medium containing approximately 800 pfu of the 
pSinRep 5 cDNA library ECV 304. After removal of the medium, the cells were overlaid 
with 3 ml 0.8% agarose (Carl Roth GmbH, Karlsruhe, Germany) in HP-1 medium and were 
then incubated for 2 days at 30°C in 5% CO.. 

Two (2) hours before starvation, the cells were shifted to 40°C in 5% CO2. The
agarose was overlaid with 1 ml 1x RPMI 1640 Met-/Cys- for 10 min before being replaced
by 1 ml of fresh 1x RPMI 1640 Met-/Cys-. After 30 min starvation, the medium was
replaced with 0.8 ml 1x RPMI 1640 Met-/Cys- plus 20 µCi of 35S Met/Cys. After 4 hours of
labeling, the agarose overlay was washed 3 times for 10 min with 1 ml 1x HP-1, before the
54 mm pre-wetted nitrocellulose filter (0.45 µm membrane, BA85, Schleicher & Schuell,
Germany) was applied for 16 to 18 hours.

Diffusion blotting proceeded for 18 hours, before the nitrocellulose was
dried and exposed for at least 24 hours to an X-ray film (Hyperfilm βmax, Amersham,
Sweden) and then developed with an AGFA Curix 60 (Schenk, Winterthur, Switzerland)
machine. The coordinates of the radioactive spots were determined and the equivalent
regions were picked from the agarose layer with a 10 µl tip.
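
As an illustration of how spot coordinates read from the film might be related to
positions on the agarose layer, the following Python sketch assumes that two reference
marks are visible on both the film and the dish. The reference marks, the function name
make_film_to_dish and the coordinate units are assumptions introduced for this example and
are not part of the procedure described here; the sketch also assumes that the film is not
mirrored relative to the dish.

```python
def make_film_to_dish(film_marks, dish_marks):
    """Build a converter from film coordinates to dish coordinates.

    film_marks and dish_marks each hold two (x, y) points that mark the same
    two physical positions (hypothetical registration marks). The mapping is
    a rotation, uniform scaling and shift, written with complex numbers.
    """
    f1, f2 = (complex(*p) for p in film_marks)
    d1, d2 = (complex(*p) for p in dish_marks)
    if f1 == f2:
        raise ValueError("the two film marks must be distinct")
    scale_rotation = (d2 - d1) / (f2 - f1)
    shift = d1 - scale_rotation * f1

    def convert(point):
        z = scale_rotation * complex(*point) + shift
        return (z.real, z.imag)

    return convert


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    to_dish = make_film_to_dish(((0, 0), (10, 0)), ((5, 5), (5, 25)))
    print(to_dish((5, 0)))
```

Two marks are enough for this kind of mapping only if the film is neither stretched nor
flipped; a third mark would be needed to detect a mirrored film.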

K. Example 10: Expression Cloning Using Sindbis Virus: Cloning A
Receptor For A Known Ligand By FACS Single Cell Sorting

60% confluent BHK 21 cultures in T150 flasks were infected at 37°C for 2
hours at an moi of 0.1 in 1x HP-1 medium. Different dilutions of pSinRep5 hIL13Rα
(human Interleukin 13 Receptor, subunit alpha; hIL13Rα digested from pDR2-hIL13Rα
with XbaI and PvuII, cloned into pSinRep5 digested with XbaI and StuI) /DHEB with
pSinRep5 LacZ/DHEB (1:0 / 1:100 / 1:100,000) were used with a total moi of 0.1. The
medium was then replaced with fresh 1x HP-1 medium and the cells were incubated at
37°C in 5% CO2 for 20 hours.
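
For orientation, the infection conditions above can be put into numbers under the common
assumption that the number of particles taken up per cell follows a Poisson distribution.
The Python sketch below is illustrative only: the function names, the reading of "1:N" as
one receptor-encoding particle per N LacZ particles, and the figure of 1e7 cells per flask
are assumptions and do not come from the experiment.

```python
import math


def infection_fractions(moi):
    """Poisson estimate of the fraction of cells receiving 0, 1 or >1 particles."""
    p0 = math.exp(-moi)
    p1 = moi * math.exp(-moi)
    return {"uninfected": p0, "single": p1, "multiple": 1.0 - p0 - p1}


def expected_target_cells(n_cells, total_moi, background_per_target):
    """Expected number of cells infected by at least one target particle.

    background_per_target is the N of a 1:N mixture of target to background
    particles (0 means that only target particles are present).
    """
    target_moi = total_moi / (1.0 + background_per_target)
    return n_cells * (1.0 - math.exp(-target_moi))


if __name__ == "__main__":
    print(infection_fractions(0.1))
    n_cells = 1e7  # hypothetical number of cells per T150 flask
    for n in (0, 100, 100_000):
        print(f"1:{n}", round(expected_target_cells(n_cells, 0.1, n), 1))
```

Under this model, a low moi keeps most infected cells singly infected, which is what allows
a sorted cell to be traced back to one library member.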

The cells were then resuspended with 5 ml cell dissociation solution
(Sigma-Aldrich, Steinheim, Germany) followed by two additional washing steps in HBSS
containing 1% BSA. 1 ml 1% BSA, 1 ng/ml IL13flag (according to IBR GmbH, Waengi,
Switzerland) and 5 µg/ml mAb M2 (according to IBR GmbH, Waengi, Switzerland) in
HBSS were incubated with the cells at 4°C for 15 min. The cells were washed twice with
1.5 ml 1% BSA/HBSS and incubated with 1 ml 1% BSA and 10 ng/ml FITC conjugated
secondary antibody for 30 min at 4°C. The cells were washed twice with 4°C 1% BSA in
HBSS and resuspended in HBSS at a concentration of 3 x 10^6 / ml. Single cell sorting into
96-well plates was done for cells with FITC fluorescence intensity above background (see
FIGURE 9).

FIGURE 9 shows that the Interleukin 13 Receptor alpha is transiently expressed and
that single cells expressing the correct receptor for a chosen ligand can be sorted by FACS
analysis.
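
The criterion "fluorescence intensity above background" can be sketched in Python as a
simple threshold taken from a negative-control sample. This is a minimal sketch under
stated assumptions: the quantile-based threshold, the function names and the example
intensities are hypothetical and are not taken from the sorter settings used here.

```python
def gate_threshold(background_intensities, quantile=0.999):
    """Return the intensity at the given quantile of a negative-control sample."""
    ordered = sorted(background_intensities)
    if not ordered:
        raise ValueError("background sample is empty")
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def select_for_sorting(event_intensities, threshold):
    """Indices of events whose FITC intensity exceeds the threshold."""
    return [i for i, value in enumerate(event_intensities) if value > threshold]


if __name__ == "__main__":
    background = [12, 15, 9, 20, 18, 14, 11, 16]  # made-up control intensities
    sample = [13, 250, 17, 19, 480, 21]           # made-up sample intensities
    threshold = gate_threshold(background, quantile=1.0)
    print(threshold, select_for_sorting(sample, threshold))
```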



L. Example 11: Expression Cloning Using Sindbis Viruses: Cloning A 
Ligand For A Known Receptor 


An infectious Sindbis Virus library is produced in double subgenomic
Sindbis Virus vectors (pTE) or in pSinRep5 copackaged with the DH-EB helper as
described supra. Confluent BHK 21 cells in 150 mm dishes are incubated for two (2) hours
with HP-1 medium containing about 5,000 pfu of the infectious Sindbis Virus library. After
removal of the medium, the cells are overlaid with 2 mm of 0.8% agarose in 1x HP-1
medium and are incubated for three days at 37°C. A nitrocellulose filter is placed on top of
the agarose and incubated for at least three hours. The membrane is removed and
subsequently washed with TBS. After agitating the filter for 15 minutes in TBS, the filter
is blocked by incubation in 1% skim milk for one (1) hour. After three washing steps in
TBS, the filter is incubated in TBS containing the radiolabeled solubilized receptor for
at least one (1) hour. After several washing steps in TBS, the filter is dried and exposed
to X-ray film. Dark spots indicate colonies expressing a binding partner of the chosen
receptor of interest.

M. Example 12: Cloning A Cell Associated Protein For Which The Ligand
Is Known

An infectious Sindbis Virus library is produced in double subgenomic
Sindbis Virus vectors (pTE) or in pSinRep5 copackaged with the DH-EB helper as
described in the Materials and Methods section, supra. BHK 21 cells are grown on
nitrocellulose filters in 150 mm dishes to confluency. The cells are incubated for two hours
with HP-1 medium containing about 5,000 pfu. After removal of the medium, the cells are
overlaid with 2 mm of 0.8% low melting agarose in 1x HP-1 medium and are incubated for
two days at 37°C. After this incubation, a nitrocellulose filter is placed on top of the
agarose to capture viral particles. After 24 hours of incubation, the nitrocellulose filters are
removed and placed in new 150 mm dishes where a new culture of BHK cells is added at a
cell density of 400,000 cells per cm2 in medium containing 10% FCS. After adhesion of
the cells, this culture is again overlaid with 2 mm agarose and incubated at 37°C. This
culture serves as a stock of virus particles for later isolation. The initial plate is heated to
40°C to melt the agarose. After removal of the melted agarose, the cells are fixed to the
membrane by known methods (0.5% glutaraldehyde in PBS, 3% paraformaldehyde, -20°C
methanol and the like) according to the expected localization (nucleus, cytoplasmic
membrane, etc.) and nature of the protein that is to be cloned. After several washing steps
in TBS, the filters are incubated in TBS containing the radioactively labelled ligand. After
several washing steps, the filters are dried and exposed to X-ray films. Dark spots indicate
colonies expressing a binding partner of the chosen ligand.

N. Example 13: Discovery And Expression Of Novel Membrane Proteins

A Sindbis Virus library in pSinRep5 as described in Example 6 is
coelectroporated with the DHBB helper to produce one way virions described supra. After
determination of the virus titer, confluent cultures of 987BBneo cells are infected (see
Table I and SEQ ID NO:2 for a description of the vector by which BHK 21 cells were
transfected to yield the 987BBneo cell line after selection in 200 µg/ml G418). After
removal of the medium, the cells are overlaid with 2 mm of agarose to allow plaque
formation. After three days, a coated polyester mat as described in Motobu et al. (in E.C.
Beuvery et al. (eds.) "Animal Cell Technology: Developments towards the 21st Century",
pp. 811-815 (1995) Kluwer Academic Publishers) is placed on top of the agarose and BHK 21
cells are added on top at a cell density of 400,000 cells/cm2 at a concentration of
1,000,000 cells/ml of medium. After adhesion of the cells, the same amount of fresh medium
is added and the culture is incubated for an additional 12 hours. The liquid medium is
removed and replaced by labelling medium (10 µCi of 35S methionine/cysteine). After 12
hours of incorporation, the labelling medium is removed and the culture is washed by adding
10 ml of normal medium. This medium is replaced every 10 minutes for a total of two hours.
After removal of all liquid medium, the cells are overlaid with a 2 mm layer of medium
containing 0.8% agarose and 0.1% trypsin. After gelling of the agarose, a nitrocellulose
membrane is placed on top of the agarose and the culture is incubated for an additional one
to six (6) hours. Proteolytically released fragments of membrane proteins diffuse to the
membrane and are captured on the nitrocellulose filter. Autoradiographing the filter
indicates the clones coding for membrane proteins.
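
The seeding figures in this example (400,000 cells/cm2 from a suspension of 1,000,000
cells/ml) imply a fixed volume of suspension per unit area. The short Python sketch below
works through that arithmetic; the dish diameter is a hypothetical input, since the example
does not state the vessel size, and the nominal circular area overestimates the true growth
area of a real dish.

```python
import math


def dish_area_cm2(diameter_mm):
    """Nominal area of a circular dish, in cm2, from its diameter in mm."""
    radius_cm = diameter_mm / 20.0
    return math.pi * radius_cm * radius_cm


def seeding_volume_ml(area_cm2, density_per_cm2=400_000, conc_per_ml=1_000_000):
    """Volume of cell suspension needed to reach the target cell density."""
    return area_cm2 * density_per_cm2 / conc_per_ml


if __name__ == "__main__":
    area = dish_area_cm2(150)  # hypothetical 150 mm dish
    print(round(area, 1), "cm2 ->", round(seeding_volume_ml(area), 1), "ml")
```

With the stated density and concentration, the ratio works out to 0.4 ml of suspension per
cm2 regardless of the vessel.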



O. Example 14: Discovery And Expression Of Novel Organelle Specific 
Proteins 

An infectious Sindbis Virus library is produced in double subgenomic
Sindbis Virus vectors (pTE) or in pSinRep5 copackaged with the DH-EB helper as
described in Materials and Methods. BHK 21 cells are grown in 150 mm dishes to
confluency. The cells are incubated for two hours with HP-1 medium containing about
5,000 pfu. After removal of the medium, the cells are overlaid with 2 mm of 0.8% low
melting agarose in 1x HP-1 medium and are incubated for one to two days at 37°C. After
this incubation, a nitrocellulose filter is placed on top of the agarose to capture viral
particles. After at least 12 (twelve) hours of incubation, the nitrocellulose filters are
removed and placed in new 150 mm dishes where a new culture of BHK cells is added at a
cell density of 400,000 cells per cm2 in medium containing 10% FCS. After adhesion of
the cells, this culture is again overlaid with 2 mm agarose and incubated at 37°C. This
culture serves as a stock of virus particles for later isolation.

The initial plate is overlaid with a second layer of 0.8% agarose in RPMI 1640
medium containing 40 µCi of 35S labelled mix of methionine and cysteine. After two hours
of labelling, the plates are heated to 40°C to melt the agarose. After removal of the melted
agarose, the cells are lysed by known methods to yield intact organelles. A blocked
nitrocellulose filter with immobilized antibodies against a surface protein of the organelle
to be captured is placed on top of the homogenate. After several washing steps, the filters
are dried and exposed to X-ray films. Autoradiographing the filter indicates the clones
coding for proteins localized in the desired organelle.

P. Example 15: Expression Cloning Using Sindbis Virus: Cloning A New
Receptor For A Known Ligand By FACS Single Cell Sorting

60% confluent BHK 21 cells in a T150 flask are infected for 2 hours with the
pSinRep ECV cDNA library at an moi of 0.1 in 20 ml Turbodoma HP-1 medium. The
medium is then replaced with 40 ml fresh 1x HP-1 medium and the cells are incubated at
37°C in 5% CO2 for 20 hours.

The cells are then resuspended with 5 ml cell dissociation solution (Sigma-Aldrich,
Steinheim, Germany) and washed twice in HBSS containing 1% BSA. The cells are
incubated in 1 ml HBSS in the presence of 1% BSA, 1 µg/ml ligand-flag fusion protein and
5 ng/ml mAb M2 at 4°C for 15 min and washed twice with 1.5 ml 1% BSA/HBSS before
subsequent incubation with 1 ml 1% BSA and 10 µl/ml antiM2 FITC for 30 min at 4°C.
The cells are washed twice with 1% BSA in HBSS before resuspension in HBSS at a
concentration of 3 x 10^6 / ml. Single cell sorting into 96-well plates is done for cells with
FITC activity above background.

The supernatants of the 96-well plates are used for a plaque assay in 6-well plates
with 90% confluent BHK 21 cells. Plaques are picked with 10 µl pipette tips and the virus
is eluted as described before. The eluted virus is passaged in a 60% confluent BHK 21
12-well plate in 1 ml Turbodoma HP-1 at the permissive temperature. The insert of the
eluted virus is determined by reamplification of the virus particles in 2 ml Turbodoma HP-1
medium in a 6-well plate containing 60% confluent BHK 21 cell layers. Viral RNA is isolated
using a "high pure viral RNA kit" (Boehringer Mannheim, Mannheim, Germany) and RT-PCR is
done with the Superscript one-step RT-PCR kit according to the manufacturer (Gibco/BRL,
Life Technologies AG, Basel, Switzerland) and the primers as described before. The RT-PCR
product ligation into pBluescript and the sequence determination are done as described
above.



Q. Example 16: Expression Cloning Of A Ligand For A Known Binding
Partner

pSinRep5 EPO and pSinRep5 LacZ were packaged with the helper DHEB,
yielding infective particles. 90% confluent BHK 21 cell layers in 60 mm dishes (Easy
Grip, Falcon 3004, Becton Dickinson and Company, England) were incubated at 37°C for
two (2) hours with 1 ml of 1x HP-1 medium containing approximately 25 to 30 pfu of
pSinRep5 EPO / DH-EB and approximately 180 to 200 pfu of pSinRep5 LacZ / DH-EB.
After removal of the medium, the cells were overlaid with 3 ml 41°C warm 0.8% agarose
(Carl Roth GmbH, Karlsruhe, Germany) in HP-1 medium and the plates were incubated for
two (2) days at 37°C in 5% CO2.
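
From the stated input of 25 to 30 pfu of the EPO construct and 180 to 200 pfu of the LacZ
construct, the share of plaques expected to be EPO positive can be estimated with simple
arithmetic. The Python sketch below shows the calculation; the function name is
hypothetical, and the estimate assumes that both viruses form plaques with equal
efficiency, which the example does not state.

```python
def positive_fraction_range(target_pfu, background_pfu):
    """Lowest and highest expected fraction of target plaques.

    Each argument is a (low, high) pair of plaque-forming units.
    """
    t_low, t_high = target_pfu
    b_low, b_high = background_pfu
    lowest = t_low / (t_low + b_high)
    highest = t_high / (t_high + b_low)
    return lowest, highest


if __name__ == "__main__":
    low, high = positive_fraction_range((25, 30), (180, 200))
    print(f"{low:.1%} to {high:.1%} of plaques expected to be EPO positive")
```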

A pre-wetted 54 mm nitrocellulose filter (0.45 µm membrane, BA85, Schleicher &
Schuell, Germany) was applied for sixteen (16) hours, before the nitrocellulose was
blocked with 1% BSA in TBS for two (2) hours. The first antibody (anti-EPO, polyclonal,
Research Diagnostics, Inc., USA) specific for the screened secreted protein was incubated
for one (1) hour at a dilution of 1:3000 in TBS (according to the manufacturer). The
membrane was then washed three (3) times with TBS 0.05% Tween before the second
antibody (AP conjugated anti-rabbit IgG, Jackson ImmunoResearch Laboratories, Inc.,
USA) was incubated with the filter at a dilution of 1:4000 in TBS 1% BSA. After washing
the membrane with TBS 0.05% Tween, AP staining was carried out as described above.

The plaques corresponding to the AP positive spots (see FIGURE 10) were picked
from the agarose layer with a 10 µl pipette tip. Elution of the virus was done as described.
The virus was passaged in a 60% confluent BHK 21 12-well plate. The insert of the eluted
virus was determined by reamplifying the virus particles in a 60% confluent BHK 21 6-well
plate before a dot blot against EPO was done as described above.

FIGURE 10 shows that a specific secreted ligand for a known binding partner
(receptor, antibody) can be cloned successfully and that the system can be efficiently used
to screen viruses representing cDNA libraries.

All references cited within the body of the instant specification are hereby 
incorporated by reference in their entirety. 

CLAIMS

WHAT IS CLAIMED IS: 

1. A method for identifying a recombinant nucleic acid encoding an exogenous
protein having a property of interest comprising:

a. providing a composition comprising a plurality of eukaryotic host cells,
wherein each host cell has an expression system comprising a different member, each
member comprising a recombinant nucleic acid encoding an exogenous protein operatively
linked to a control element;

b. culturing said eukaryotic host cells under conditions where said exogenous
protein is expressed while expression of endogenous proteins of said eukaryotic host cell is
suppressed; and

c. identifying a member comprising a recombinant nucleic acid encoding an
exogenous protein having the property of interest.

2. The method of Claim 1, wherein the property is a predetermined cellular 
localization. 



3. The method of Claim 2, wherein the predetermined cellular localization is
the extracellular space.

4. The method of Claim 2, wherein the predetermined cellular localization is
the cell membrane.

5. The method of Claim 2, wherein the predetermined cellular localization is 
selected from the group consisting of nucleus, lysosomes, mitochondria, endoplasmic 
reticulum, Golgi apparatus, peroxysomes, endosomes and cytoplasm. 

6. The method of Claim 1, wherein the property is a binding affinity to a
binding partner.

7. The method of Claim 1, wherein the property is an enzymatic activity.

8. The method of Claim 1, wherein the property is a structural property.

9. The method of Claim 1, wherein said exogenous protein is labelled while
expression of said endogenous proteins is suppressed.

10. The method of Claim 9, wherein said exogenous protein is labelled with a
marker selected from the group consisting of radioactive marker and spectroscopically
detectable marker.

11. The method of Claim 10, wherein said radioactive marker is selected from
the group consisting of radioactive amino acids and radioactive sugars.

12. The method of Claim 1, wherein each of said members is expressed in an
individual eukaryotic host cell.

13. The method of Claim 9, wherein said control element is derived from a
virus.

14. The method of Claim 13, wherein said virus suppresses expression of 
endogenous proteins. 

15. The method of Claim 13, wherein said virus is an alpha virus. 


16. The method of Claim 15, wherein said Alpha Virus is selected from the 
group consisting of Sindbis Virus, Semliki Forest Virus, and Venezuelan Equine 
Encephalitis Virus. 

17. The method of Claim 13, wherein said expression system further directs the
generation of viral particles.

18. The method of Claim 17, further comprising the step of separating said viral
particles from said exogenous protein.

19. The method of Claim 18, wherein said viral particles are separated from said
exogenous protein through filtration, chromatography or precipitation.

20. The method of Claim 14, wherein said control element lacks at least one 
function required for propagation of said virus. 

21. The method of Claim 20, wherein the function required for propagation of
said virus is provided through a helper function.

22. The method of Claim 21, wherein said helper function is provided by a viral
packaging cell line or a helper virus particle.

23. The method of Claim 1, wherein said control element is derived from a virus 
which is capable of directing expression of said nucleic acid encoding an exogenous protein 
while expression of endogenous proteins is suppressed. 

24. The method of Claim 23, wherein expression of said endogenous proteins is
suppressed by inhibiting transcription or translation of nucleic acids encoding endogenous
proteins.

25. The method of Claim 1, wherein said control element promotes transcription
of said recombinant nucleic acid encoding said exogenous protein in the presence of an
inhibitor of cellular RNA Polymerase II, and wherein in step b, said eukaryotic host cells
are cultured in the presence of an effective amount of an inhibitor of cellular RNA
Polymerase II.

26. The method of Claim 25, wherein said inhibitor is selected from the group
consisting of Actinomycin D, Aflatoxin B1, and Amatoxin.

27. The method of Claim 17, wherein said viral particles produce solubilized
viral protein, and wherein production of said solubilized viral protein is reduced.



28. The method of Claim 27, wherein generation of solubilized protein of said
virus is reduced through use of a protease inhibitor.

29. The method of Claim 27, wherein generation of solubilized protein of said 
virus is reduced through use of a mutant virus strain. 

30. The method of Claim 29, wherein said mutant virus strain is selected from
the group consisting of the temperature sensitive Sindbis mutants ts20, ts10, and ts23.

31. The method of Claim 27, wherein generation of solubilized protein of said
virus is reduced through use of a cell line deficient in cleavage of a protein of said virus.

32. The method of Claim 31, wherein said cell line is deficient in cleavage of
protein PE2 to E2 and E3.

33. The method of Claim 1, wherein said nucleic acid encoding an exogenous
protein is derived from a nucleic acid library comprising a plurality of distinct nucleic acids
encoding polypeptides.

34. The method of Claim 1, wherein said plurality of members is screened using
physical separation of said eukaryotic host cells each carrying one individual member of
said plurality of members.



35. The method of Claim 34, wherein said physical separation is achieved by
separating said eukaryotic host cells in a semi-solid medium.

36. The method of Claim 34, wherein said physical separation is achieved by
placing eukaryotic host cells in separate compartments, wherein each compartment contains
host cells carrying identical expression systems of said plurality of expression systems.

37. The method of Claim 1, wherein said expression system is capable of
self-replication.

38. A method for identifying a recombinant nucleic acid encoding an exogenous
protein having a property of interest comprising:

a. providing a composition comprising a plurality of eukaryotic host cells,
wherein each host cell has an expression system comprising a different member, each
member comprising a recombinant virus having a recombinant nucleic acid encoding an
exogenous protein operatively linked to a control element;

b. culturing said eukaryotic host cells under conditions where said exogenous
protein is expressed; and

c. identifying a member comprising a recombinant nucleic acid encoding an
exogenous protein having the property of interest.

39. The method of Claim 38, wherein the property is a predetermined cellular
localization.

40. The method of Claim 39, wherein the predetermined cellular localization is 
the extracellular space. 

41. The method of Claim 39, wherein the predetermined cellular localization is
the cell membrane.

42. The method of Claim 39, wherein the predetermined cellular localization is
selected from the group consisting of nucleus, lysosomes, mitochondria, endoplasmic
reticulum, Golgi apparatus, peroxysomes, endosomes and cytoplasm.

43. The method of Claim 38, wherein the property is a binding affinity to a
binding partner.

44. The method of Claim 38, wherein the property is an enzymatic activity.

45. The method of Claim 38, wherein the property is a structural property. 

46. The method of Claim 38, wherein the eukaryotic host cells are cultured
under conditions where said exogenous protein is expressed while expression of
endogenous proteins of the eukaryotic host cell is suppressed.

47. The method of Claim 38, wherein said exogenous protein is labelled while 
expression of said endogenous proteins is suppressed. 

48. The method of Claim 47, wherein said exogenous protein is labelled with a
marker selected from the group consisting of radioactive marker and spectroscopically
detectable marker.

49. The method of Claim 48, wherein said radioactive marker is selected from
the group consisting of radioactive amino acids and radioactive sugars.

50. The method of Claim 38, wherein each of said members is expressed in an 
individual eukaryotic host cell. 

51. The method of Claim 38, wherein said virus suppresses expression of
endogenous proteins.

52. The method of Claim 38, wherein said virus is an alpha virus.

53. The method of Claim 52, wherein said Alpha Virus is selected from the
group consisting of Sindbis Virus, Semliki Forest Virus, and Venezuelan Equine
Encephalitis Virus.

54. The method of Claim 38, wherein said virus further directs the generation of
viral particles.

55. The method of Claim 54, further comprising the step of separating said viral 
particles from said exogenous protein. 


56. The method of Claim 55, wherein said viral particles are separated from said 
exogenous protein through filtration, chromatography or precipitation. 

57. The method of Claim 51, wherein said virus lacks at least one function
required for propagation of said virus.

58. The method of Claim 57, wherein the function required for propagation of 
said virus is provided through a helper function. 

59. The method of Claim 58, wherein said helper function is provided by a viral
packaging cell line or a helper virus particle.

60. The method of Claim 38, wherein said virus is capable of directing
expression of said nucleic acid encoding an exogenous protein while expression of
endogenous proteins is suppressed.

61. The method of Claim 60, wherein expression of said endogenous proteins is
suppressed by inhibiting transcription or translation of nucleic acids encoding endogenous
proteins.

62. The method of Claim 54, wherein said viral particles produce solubilized
viral protein, and wherein production of said solubilized viral protein is reduced.

63. The method of Claim 62, wherein generation of solubilized protein of said
virus is reduced through use of a protease inhibitor.

64. The method of Claim 62, wherein generation of solubilized protein of said 
virus is reduced through use of a mutant virus strain. 

65. The method of Claim 64, wherein said mutant virus strain is selected from
the group consisting of the temperature sensitive Sindbis mutants ts20, ts10, and ts23.

66. The method of Claim 62, wherein generation of solubilized protein of said 
virus is reduced through use of a cell line deficient in cleavage of a protein of said virus. 


67. The method of Claim 66, wherein said cell line is deficient in cleavage of 
protein PE2 to E2 and E3. 

68. The method of Claim 38, wherein said nucleic acid encoding an exogenous
protein is derived from a nucleic acid library comprising a plurality of distinct nucleic acids
encoding polypeptides.

69. The method of Claim 38, wherein said plurality of members is screened
using physical separation of said eukaryotic host cells each carrying one individual member
of said plurality of members.

70. The method of Claim 69, wherein said physical separation is achieved by 
separating said eukaryotic host cells in a semi-solid medium. 

71. The method of Claim 69, wherein said physical separation is achieved by
placing eukaryotic host cells in separate compartments, wherein each compartment contains
host cells carrying identical expression systems of said plurality of expression systems.

72. The method of Claim 38, wherein said recombinant virus is capable of
self-replication.



73. A method for generating a genetic expression library encoding proteins 
having a predetermined property of interest comprising: 

a. providing a composition comprising a plurality of eukaryotic host cells, 
wherein each host cell has an expression system comprising a different member, each 
member comprising a recombinant nucleic acid encoding an exogenous protein operatively 
linked to a control element; 

b. culturing said eukaryotic host cells under conditions where said exogenous 
protein is expressed while expression of endogenous proteins of said eukaryotic host cell is 
suppressed; and 

c. identifying the members comprising recombinant nucleic acids encoding 
exogenous proteins having the property of interest. 



74. The method of Claim 73, wherein said nucleic acid encoding an exogenous
protein is derived from a nucleic acid library comprising a plurality of nucleic acids
encoding distinct polypeptides.

75. A genetic expression library comprising a plurality of expression systems 
encoding proteins located in a predetermined cellular localization, produced by the method 
of Claim 73. 






76. The expression library of Claim 75, wherein the proteins are secreted
proteins.

77. The expression library of Claim 75, wherein the proteins are cell membrane
associated receptors.

78. The expression library of Claim 75, wherein the cellular localization is
selected from the group consisting of nucleus, lysosome, and mitochondria.

79. A library of exogenous proteins produced using the method of Claim 73.

80. A nucleic acid library comprising a population of eukaryotic expression
systems having a plurality of different members, each member comprising a recombinant
nucleic acid encoding an exogenous protein operatively linked to a control element,
wherein the control element directs the expression of the exogenous proteins while
expression of endogenous proteins of said eukaryotic host cells is suppressed.

81. A method for generating a genetic expression library encoding proteins
having a predetermined property of interest comprising:

a. providing a composition comprising a plurality of eukaryotic host cells,
wherein each host cell has an expression system comprising a different member, each
member comprising a recombinant virus having a recombinant nucleic acid encoding an
exogenous protein operatively linked to a control element;

b. culturing said eukaryotic host cells under conditions where said exogenous
protein is expressed; and

c. identifying members comprising recombinant nucleic acids encoding
exogenous proteins having the property of interest.

82. The method of Claim 81, wherein said nucleic acid encoding an exogenous
protein is derived from a nucleic acid library comprising a plurality of nucleic acids
encoding distinct polypeptides.

83. A genetic expression library comprising a plurality of expression systems
encoding proteins located in a predetermined cellular localization, produced by the method
of Claim 81.

84. The expression library of Claim 83, wherein the proteins are secreted
proteins.

85. The expression library of Claim 83, wherein the proteins are cell membrane
associated receptors.

86. The expression library of Claim 83, wherein the cellular localization is 
selected from the group consisting of nucleus, lysosome, and mitochondria. 

87. A library of exogenous proteins produced using the method of Claim 81.

88. A nucleic acid library comprising a population of eukaryotic expression 
systems having a plurality of different members, each member comprising a recombinant 
virus having a recombinant nucleic acid encoding an exogenous protein operatively linked 
to a control element.

pSinRep5

ATTGACGGCGTAGTACACACTATTGAATCAAACAGCCGACCAATTGCACTACCATCACAATGGAGAAGCCAGTAGTAAAC

GTAGACGTAGACCCCCAGAGTCCGTTTGTCGTGCAACTGCAAAAAAGCTTCCCGCAATTTGAGGTAGTAGCACAGCAGGT 

CACTCCAAATGACCATGCTAATGCCAGAGCATTTTCGCATCTGGCCAGTAAACTAATCGAGCTGGAGGTTCCTACCACAG 

CGACGATCTTGGACATAGGCAGCGCACCGGCTCGTAGAATGTTTTCCGAGCACCAGTATCATTGTGTCTGCCCCATGCGT 

AGTCCAGAAGACCCGGACCGCATGATGAAATACGCCAGTAAACTGGCGGAAAAAGCGTGCAAGATTACAAACAAGAACTT 

GCATGAGAAGATTAAGGATCTCCGGACCGTACTTGATACGCCGGATGCTGAAACACCATCGCTCTGCTTTCACAACGATG 

TTACCTGCAACATGCGTGCCGAATATTCCGTCATGCAGGACGTGTATATCAACGCTCCCGGAACTATCTATCATCAGGCT 

ATGAAAGGCGTGCGGACCCTGTACTGGATTGGCTTCGACACCACCCAGTTCATGTTCTCGGCTATGGCAGGTTCGTACCC 

TGCGTACAACACCAACTGGGCCGACGAGAAAGTCCTTGAAGCGCGTAACATCGGACTTTGCAGCACAAAGCTGAGTGAAG 

GTAGGACAGGAAAATTGTCGATAATGAGGAAGAAGGAGTTGAAGCCCGGGTCGCGGGTTTATTTCTCCGTAGGATCGACA 

CTTTATCCAGAACACAGAGCCAGCTTGCAGAGCTGGCATCTTCCATCGGTGTTCCACTTGAATGGAAAGCAGTCGTACAC 

TTGCCGCTGTGATACAGTGGTGAGTTGCGAAGGCTACGTAGTGAAGAAAATCACCATCAGTCCCGGGATCACGGGAGAAA 

CCGTGGGATACGCGGTTACACACAATAGCGAGGGCTTCTTGCTATGCAAAGTTACTGACACAGTAAAAGGAGAACGGGTA 

TCGTTCCCTGTGTGCACGTACATCCCGGCCACCATATGCGATCAGATGACTGGTATAATGGCCACGGATATATCACCTGA 

CGATGCACAAAAACTTCTGGTTGGGCTCAACCAGCGAATTGTCATTAACGGTAGGACTAACAGGAACACCAACACCATGC 

AAAATTACCTTCTGCCGATCATAGCACAAGGGTTCAGCAAATGGGCTAAGGAGCGCAAGGATGATCTTGATAACGAGAAA 

ATGCTGGGTACTAGAGAACGCAAGCTTACGTATGGCTGCTTGTGGGCGTTTCGCACTAAGAAAGTACATTCGTTTTATCG 

CCCACCTGGAACGCAGACCTGCGTAAAAGTCCCAGCCTCTTTTAGCGCTTTTCCCATGTCGTCCGTATGGACGACCTCTT 

TGCCCATGTCGCTGAGGCAGAAATTGAAACTGGCATTGCAACCAMGAAGGAGGAAAAACTGCTGCAGGTCTCGGAGGAA 

TTAGTCATGGAGGCCAAGGCTGCTTTTGAGGATGCTCAGGAGGAAGCCAGAGCGGAGAAGCTCCGAGAAGCACTTCCACC 

ATTAGTGGCAGACAAAGGCATCGAGGCAGCCGCAGAAGTTGTCTGCGAAGTGGAGGGGCTCCAGGCGGACATCGGAGCAG 

CATTAGTTGAAACCCCGCGCGGTCACGTAAGGATAATACCTCAAGCAAATGACCGTATGATCGGACAGTATATCGTTGTC 

TCGCCAAACTCTGTGCTGAAGAATGCCAAACTCGCACCAGCGCACCCGCTAGCAGATCAGGTTAAGATCATAACACACTC 

CGGAAGATCAGGAAGGTACGCGGTCGAACCATACGACGCTAAAGTACTGATGCCAGCAGGAGGTGCCGTACCATGGCCAG 

AATTCCTAGCACTGAGTGAGAGCGCCACGTTAGTGTACAACGAAAGAGAGTTTGTGAACCGCAAACTATACCACATTGCC 

ATGCATGGCCCCGCCAAGAATACAGAAGAGGAGCAGTACAAGGTTACAAAGGCAGAGCTTGCAGAAACAGAGTACGTGTT 

TGACGTGGACAAGAAGCGTTGCGTTAAGAAGGAAGAAGCCTCAGGTCTGGTCCTCTCGGGAGAACTGACCAACCCTCCCT 

ATCATGAGCTAGCTCTGGAGGGACTGAAGACCCGACCTGCGGTCCCGTACAAGGTCGAAACAATAGGAGTGATAGGCACA 

CCGGGGTCGGGCAAGTCAGCTATTATCAAGTCAACTGTCACGGCACGAGATCTTGTTACCAGCGGAAAGAAAGAAAATTG 

TCGCGAAATTGAGGCCGACGTGCTAAGACTGAGGGGTATGCAGATTACGTCGAAGACAGTAGATTCGGTTATGCTCAACG 

GATGCCACAAAGCCGTAGAAGTGCTGTACGTTGACGAAGCGTTCGCGTGCCACGCAGGAGCACTACTTGCCTTGATTGCT

ATCGTCAGGCCCCGCAAGAAGGTAGTACTATGCGGAGACCCCATGCAATGCGGATTCTTCAACATGATGCAACTAAAGGT

ACATTTCAATCACCCTGAAAAAGACATATGCACCAAGACATTCTACAAGTATATCTCCCGGCGTTGCACACAGCCAGTTA 

CAGCTATTGTATCGACACTGCATTACGATGGAAAGATGAAAACCACGAACCCGTGCAAGAAGAACATTGAAATCGATATT 

ACAGGGGCCACAAAGCCGAAGCCAGGGGATATCATCCTGACATGTTTCCGCGGGTGGGTTAAGCAATTGCAAATCGACTA 

TCCCGGACATGAAGTAATGACAGCCGCGGCCTCACAAGGGCTAACCAGAAAAGGAGTGTATGCCGTCCGGCAAAAAGTCA 

ATGAAAACCCACTGTACGCGATCACATCAGAGCATGTGAACGTGTTGCTCACCCGCACTGAGGACAGGCTAGTGTGGAAA 

ACCTTGCAGGGCGACCCATGGATTAAGCAGCCCACTAACATACCTAAAGGAAACTTTCAGGCTACTATAGAGGACTGGGA

AGCTGAACACAAGGGAATAATTGCTGCAATAAACAGCCCCACTCCCCGTGCCAATCCGTTCAGCTGCAAGACCAACGTTT 

GCTGGGCGAAAGCATTGGAACCGATACTAGCCACGGCCGGTATCGTACTTACCGGTTGCCAGTGGAGCGAACTGTTCCCA 

CAGTTTGCGGATGACAAACCACATTCGGCCATTTACGCCTTAGACGTAATTTGCATTAAGTTTTTCGGCATGGACTTGAC 

AAGCGGACTGTTTTCTAAACAGAGCATCCCACTAACGTACCATCCCGCCGATTCAGCGAGGCCGGTAGCTCATTGGGACA 

ACAGCCCAGGAACCCGCAAGTATGGGTACGATCACGCCATTGCCGCCGAACTCTCCCGTAGATTTCCGGTGTTCCAGCTA 

GCTGGGAAGGGCACACAACTTGATTTGCAGACGGGGAGAACCAGAGTTATCTCTGCACAGCATAACCTGGTCCCGGTGAA 

CCGCAATCTTCCTCACGCCTTAGTCCCCGAGTACAAGGAGAAGCAACCCGGCCCGGTCAAAAAATTCTTGAACCAGTTCA 
AACACCACTCAGTACTTGTGGTATCAGAGGAAAAAATTGAAGCTCCCCGTAAGAGAATCGAATGGATCGCCCCGATTGGC
ATAGCCGGTGCAGATAAGAACTACAACCTGGCTTTCGGGTTTCCGCCGCAGGCACGGTACGACCTGGTGTTCATCAACAT 
TGGAACTAAATACAGAAACCACCACTTTCAGCAGTGCGAAGACCATGCGGCGACCTTAAAAACCCTTTCGCGTTCGGCCC 
TGAATTGCCTTAACCCAGGAGGCACCCTCGTGGTGAAGTCCTATGGCTACGCCGACCGCAACAGTGAGGACGTAGTCACC 
GCTCTTGCCAGAAAGTTTGTCAGGGTGTCTGCAGCGAGACCAGATTGTGTCTCAAGCAATACAGAAATGTACCTGATTTT 
CCGACAACTAGACAACAGCCGTACACGGCAATTCACCCCGCACCATCTGAATTGCGTGATTTCGTCCGTGTATGAGGGTA 
CAAGAGATGGAGTTGGAGCCGCGCCGTCATACCGCACCAAAAGGGAGAATATTGCTGACTGTCAAGAGGAAGCAGTTGTC 
AACGCAGCCAATCCGCTGGGTAGACCAGGCGAAGGAGTCTGCCGTGCCATCTATAAACGTTGGCCGACCAGTTTTACCGA 
TTCAGCCACGGAGACAGGCACCGCAAGAATGACTGTGTGCCTAGGAAAGAAAGTGATCCACGCGGTCGGCCCTGATTTCC 
GGAAGCACCCAGAAGCAGAAGCCTTGAAATTGCTACAAAACGCCTACCATGCAGTGGCAGACTTAGTAAATGAACATAAC 
ATCAAGTCTGTCGCCATTCCACTGCTATCTACAGGCATTTACGCAGCCGGAAAAGACCGCCTTGAAGTATCACTTAACTG 
CTTGACAACCGCGCTAGACAGAACTGACGCGGACGTAACCATCTATTGCCTGGATAAGAAGTGGAAGGAAAGAATCGACG 
CGGCACTCCAACTTAAGGAGTCTGTAACAGAGCTGAAGGATGAAGATATGGAGATCGACGATGAGTTAGTATGGATtCAT 
CCAGACAGTTGCTTGAAGGGAAGAAAGGGATTCAGTACTACAAAAGGAAAATTGTATTCGTACTTCGAAGGCACCAAATT 
CCATCAAGCAGCAAAAGACATGGCGGAGATAAAGGTCCTGTTCCCTAATGACCAGGAAAGTAATGAACAACTGTGTGCCT 
ACATATTGGGTGAGACCATGGAAGCAATCCGCGAAAAGTGCCCGGTCGACCATAACCCGTCGTCTAGCCCGCCCAAAACG 
TTGCCGTGCCTTTGCATGTATGCCATGACGCCAGAAAGGGTCCACAGACTTAGAAGCAATAACGTCAAAGAAGTTACAGT 
ATGCTCCTCCACCCCCCTTCCTAAGCACAAAATTAAGAATGTTCAGAAGGTTCAGTGCACGAAAGTAGTCCTGTTTAATC 
CGCACACTCCCGCATTCGTTCCCGCCCGTAAGTACATAGAAGTGCCAGAACAGCCTACCGCTCCTCCTGCACAGGCCGAG 
GAGGCCCCCGAAGTTGTAGCGACACCGTCACCATCTACAGCTGATAACACCTCGCTTGATGTCACAGACATCTCACTGGA 
TATGGATGACAGTAGCGAAGGCTCACTTTTTTCGAGCTTTAGCGGATCGGACAACTCTATTACTAGTATGGACAGTTGGT 
CGTCAGGACCTAGTTCACTAGAGATAGTAGACCGAAGGCAGGTGGTGGTGGCTGACGTTCATGCCGTCCAAGAGCCTGCC 
CCTATTCCACCGCCAAGGCTAAAGAAGATGGCCCGCCTGGCAGCGGCAAGAAAAGAGCCCACTCCACCGGCAAGCAATAG 
CTCTGAGTCCCTCCACCTCTCTTTTGGTGGGGTATCCATGTCCCTCGGATCAATTTTCGACGGAGAGACGGCCCGCCAGG 
CAGCGGTACAACCCCTGGCAACAGGCCCCACGGATGTGCCTATGTCTTTCGGATCGTTTTCCGACGGAGAGATTGATGAG 
CTGAGCCGCAGAGTAACTGAGTCCGAACCCGTCCTGTTTGGATCATTTGAACCGGGCGAAGTGAACTCAATTATATCGTC 
CCGATCAGCCGTATCTTTTCCACTACGCAAGCAGAGACGTAGACGCAGGAGCAGGAGGACTGAATACTGACTAACCGGGG 
TAGGTGGGTACATATTTTCGACGGACACAGGCCCTGGGCACTTGCAAAAGAAGTCCGTTCTGCAGAACCAGCTTACAGAA 
CCGACCTTGGAGCGCAATGTCCTGGAAAGAATTCATGCCCCGGTGCTCGACACGTCGAAAGAGGAACAACTCAAACTCAG 
GTACCAGATGATGCCCACCGAAGCCAACAAAAGTAGGTACCAGTCTCGTAAAGTAGAAAATCAGAAAGCCATAACCACTG 
AGCGACTACTGTCAGGACTACGACTGTATAACTCTGCCACAGATCAGCCAGAATGCTATAAGATCACCTATCCGAAACCA 
TTGTACTCCAGTAGCGTACCGGCGAACTACTCCGATCCACAGTTCGCTGTAGCTGTCTGTAACAACTATCTGCATGAGAA 
CTATCCGACAGTAGCATCTTATCAGATTACTGACGAGTACGATGCTTACTTGGATATGGTAGACGGGACAGTCGCCTGCC 
TGGATACTGCAACCTTCTGCCCCGCTAAGCTTAGAAGTTACCCGAAAAAACATGAGTATAGAGCCCCGAATATCCGCAGT 
GCGGTTCCATCAGCGATGCAGAACACGCTACAAAATGTGCTCATTGCCGCAACTAAAAGAAATTGCAACGTCACGCAGAT 
GCGTGAACTGCCAACACTGGACTCAGCGACATTCAATGTCGAATGCTTTCGAAAATATGCATGTAATGACGAGTATTGGG
AGGAGTTCGCTCGGAAGCCAATTAGGATTACCACTGAGTTTGTCACCGCATATGTAGCTAGACTGAAAGGCCCTAAGGCC 
GCCGCACTATTTGCAAAGACGTATAATTTGGTCCCATTGCAAGAAGTGCCTATGGATAGATTCGTCATGGACATGAAAAG 
AGACGTGAAAGTTACACCAGGCACGAAACACACAGAAGAAAGACCGAAAGTACAAGTGATACAAGCCGCAGAACCCCTGG 
CGACTGCTTACTTATGCGGGATTCACCGGGAATTAGTGCGTAGGCTTACGGCCGTCTTGCTTCCAAACATTCACACGCTT 
TTTGACATGTCGGCGGAGGATTTTGATGCAATCATAGCAGAACACTTCAAGCAAGGCGACCCGGTACTGGAGACGGATAT 
CGCATCATTCGACAAAAGCCAAGACGACGCTATGGCGTTAACCGGTCTGATGATCTTGGAGGACCTGGGTGTGGATCAAC 
CACTACTCGACTTGATCGAGTGCGCCTTTGGAGAAATATCATCCACCCATCTACCTACGGGTACTCGTTTTAAATTCGGG 
GCGATGATGAAATCCGGAATGTTCCTCACACTTTTTGTCAACACAGTTTTGAATGTCGTTATCGCCAGCAGAGTACTAGA 
AGAGCGGCTTAAAACGTCCAGATGTGCAGCGTTCATTGGCGACGACAACATCATACATGGAGTAGTATCTGACAAAGAAA 
TGGCTGAGAGGTGCGCCACCTGGCTCAACATGGAGGTTAAGATCATCGACGCAGTCATCGGTGAGAGACCACCTTACTTC
TGCGGCGGATTTATCTTGCAAGATTCGGTTACTTCCACAGCGTGCCGCGTGGCGGATCCCCTGAAAAGGCTGTTTAAGTT 
GGGTAAACCGCTCCCAGCCGACGACGAGCAAGACGAAGACAGAAGACGCGCTCTGCTAGATGAAACAAAGGCGTGGTTTA 
GAGTAGGTATAACAGGCACTTTAGCAGTGGCCGTGACGACCCGGTATGAGGTAGACAATATTACACCTGTCCTACTGGCA 
TTGAGAACTTTTGCCCAGAGCAAAAGAGCATTCCAAGCCATCAGAGGGGAAATAAAGCATCTCTACGGTGGTCCTAAATA 
GTCAGCATAGTACATTTCATCTGACTAATACTACAACACCACCACCtctagaCGCGTAGAtctcocgtgogcotgcaggc 
cttgggCCCAATGATCCGACCAGCAAAACTCGATGTACTTCCGAGGAACTGATGTGCATAATGCATCAGGCTGGTACATT 
AGATCCCCGCTTACCGCGGGCAATATAGCAACACTAAAAACTCGATGTACTTCCGAGGAAGCGCAGTgCATAATGCTGCG 
CaGTGTTGCCACATAACCACTATATTAACCATTTATCTAGCGGACGCCAAAAACTCAATGTATTTCTGAGGAAGCGTGGT 
GCATAATGCCACGCAGCGTCTGCATAACTTTTATTATTTCTTTTATTAATCAACAAAATTTTGTTTTTAACATTTCAAAA 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGGGAATTCctcgattaattaagcggccgctcgaGGGGAATTAATTCT
TGAAGACGAAAGGGCCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCA 
AATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACA 
TTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAA 
AAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTT 
CGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTGTTGACGCCGG 
GCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTA 
CGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACA 
ACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACC 
GGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTGtAGCAATGGCAACAACGTTGCGCAAACTAT 
TAACTGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTT 
CTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGGTCTCGCGGTATCATTGC 
AGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAA 
ATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAG 
ATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTA 
ACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCG 
TAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTT 
CCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAA 
GAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTC 
TTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCC 
AGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCATTGAGAAAGCGCCACGCTTCCCGAAGGGAG 
AAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGT 
ATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTA 
TGGAAAAACGCCAGCAACGCGAGCTCgtatggacotattgtcgttogoocgcggctacoattaotacotaoccttotgto 

tcatococotocgatttaggggacoctotag 
ATTGACGGCGTAGTACACACTATTGAATCAAACAGCCGACCAATTGCACTACCATCACAATGGAGAAGCCAGTAGTAAAC

GTAGACGTAGACCCCCAGAGTCCGTTTGTCGTGCAACTGCAAAAAAGCTTCCCGCAATTTGAGGTAGTAGCACAGCAGGT 

CACTCCAAATGACCATGCTAATGCCAGAGCATTTTCGCATCTGGCCAGTAAACTAATCGAGCTGGAGGTTCCTACCACAG 

CGACGATCTTGGACATAGGCAGCGCACCGGCTCGTAGAATGTTTTCCGAGCACCAGTATCATTGTGTCTGCCCCATGCGT 

AGTCCAGAAGACCCGGACCGCATGATGAAATACGCCAGTAAACTGGCGGAAAAAGCGTGCAAGATTACAAACAAGAACTT 

GCATGAGAAGATTAAGGATCTCCGGACCGTACTTGATACGCCGGATGCTGAAACACCATCGCTCTGCTTTCACAACGATG 

TTACCTGCAACATGCGTGCCGAATATTCCGTCATGCAGGACGTGTATATCAACGCTCCCGGAACTATCTATCATCAGGCT 

ATGAAAGGCGTGCGGACCCTGTACTGGATTGGCTTCGACACCACCCAGTTCATGTTCTCGGCTATGGCAGGTTCGTACCC 

TGCGTACAACACCAACTGGGCCGACGAGAAAGTCCTTGAAGCGCGTAACATCGGACTTTGCAGCACAAAGCTGAGTGAAG 

GTAGGACAGGAAAATTGTCGATAATGAGGAAGAAGGAGTTGAAGCCCGGGTCGCGGGTTTATTTCTCCGTAGGATCGACA 

CTTTATCCAGAACACAGAGCCAGCTTGCAGAGCTGGCATCTTCCATCGGTGTTCCACTTGAATGGAAAGCAGTCGTACAC 

TTGCCGCTGTGATACAGTGGTGAGTTGCGAAGGCTACGTAGTGAAGAAAATCACCATCAGTCCCGGGATCACGGGAGAAA 

CCGTGGGATACGCGGTTACACACAATAGCGAGGGCTTCTTGCTATGCAAAGTTACTGACACAGTAAAAGGAGAACGGGTA 

TCGTTCCCTGTGTGCACGTACATCCCGGCCACCATATGCGATCAGATGACTGGTATAATGGCCACGGATATATCACCTGA 

CGATGCACAAAAACTTCTGGTTGGGCTCAACCAGCGAATTGTCATTAACGGTAGGACTAACAGGAACACCAACACCATGC 

AAAATTACCTTCTGCCGATCATAGCACAAGGGTTCAGCAAATGGGCTAAGGAGCGCAAGGATGATCTTGATAACGAGAAA 

ATGCTGGGTACTAGAGAACGCAAGCTTACGTATGGCTGCTTGTGGGCGTTTCGCACTAAGAAAGTACATTCGTTTTATCG 

CCCACCTGGAACGCAGACCTGCGTAAAAGTCCCAGCCTCTTTTAGCGCTTTTCCCATGTCGTCCGTATGGACGACCTCTT 

TGCCCATGTCGCTGAGGCAGAAATTGAAACTGGCATTGCAACCAAAGAAGGAGGAAAAACTGCTGCAGGTCTCGGAGGAA 

TTAGTCATGGAGGCCAAGGCTGCTTTTGAGGATGCTCAGGAGGAAGCCAGAGCGGAGAAGCTCCGAGAAGCACTTCCACC 

ATTAGTGGCAGACAAAGGCATCGAGGCAGCCGCAGAAGTTGTCTGCGAAGTGGAGGGGCTCCAGGCGGACATCGGAGCAG 

CATTAGTTGAAACCCCGCGCGGTCACGTAAGGATAATACCTCAAGCAAATGACCGTATGATCGGACAGTATATCGTTGTC 

TCGCCAAACTCTGTGCTGAAGAATGCCAAACTCGCACCAGCGCACCCGCTAGCAGATCAGGTTAAGATCATAACACACTC

CGGAAGATCAGGAAGGTACGCGGTCGAACCATACGACGCTAAAGTACTGATGCCAGCAGGAGGTGCCGTACCATGGCCAG 

AATTCCTAGCACTGAGTGAGAGCGCCACGTTAGTGTACAACGAAAGAGAGTTTGTGAACCGCAAACTATACCACATTGCC 

ATGCATGGCCCCGCCAAGAATACAGAAGAGGAGCAGTACAAGGTTACAAAGGCAGAGCTTGCAGAAACAGAGTACGTGTT 

TGACGTGGACAAGAAGCGTTGCGTTAAGAAGGAAGAAGCCTCAGGTCTGGTCCTCTCGGGAGAACTGACCAACCCTCCCT 

ATCATGAGCTAGCTCTGGAGGGACTGAAGACCCGACCTGCGGTCCCGTACAAGGTCGAAACAATAGGAGTGATAGGCACA 

CCGGGGTCGGGCAAGTCAGCTATTATCAAGTCAACTGTCACGGCACGAGATCTTGTTACCAGCGGAAAGAAAGAAAATTG 

TCGCGAAATTGAGGCCGACGTGCTAAGACTGAGGGGTATGCAGATTACGTCGAAGACAGTAGATTCGGTTATGCTCAACG 

GATGCCACAAAGCCGTAGAAGTGCTGTACGTTGACGAAGCGTTCGCGTGCCACGCAGGAGCACTACTTGCCTTGATTGCT 

ATCGTCAGGCCCCGCAAGAAGGTAGTACTATGCGGAGACCCCATGCAATGCGGATTCTTCAACATGATGCAACTAAAGGT 

ACATTTCAATCACCCTGAAAAAGACATATGCACCAAGACATTCTACAAGTATATCTCCCGGCGTTGCACACAGCCAGTTA 

CAGCTATTGTATCGACACTGCATTACGATGGAAAGATGAAAACCACGAACCCGTGCAAGAAGAACATTGAAATCGATATT 

ACAGGGGCCACAAAGCCGAAGCCAGGGGATATCATCCTGACATGTTTCCGCGGGTGGGTTAAGCAATTGCAAATCGACTA 

TCCCGGACATGAAGTAATGACAGCCGCGGCCTCACAAGGGCTAACCAGAAAAGGAGTGTATGCCGTCCGGCAAAAAGTCA

ATGAAAACCCACTGTACGCGATCACATCAGAGCATGTGAACGTGTTGCTCACCCGCACTGAGGACAGGCTAGTGTGGAAA 

ACCTTGCAGGGCGACCCATGGATTAAGCAGCCCACTAACATACCTAAAGGAAACTTTCAGGCTACTATAGAGGACTGGGA

AGCTGAACACAAGGGAATAATTGCTGCAATAAACAGCCCCACTCCCCGTGCCAATCCGTTCAGCTGCAAGACCAACGTTT 

GCTGGGCGAAAGCATTGGAACCGATACTAGCCACGGCCGGTATCGTACTTACCGGTTGCCAGTGGAGCGAACTGTTCCCA 

CAGTTTGCGGATGACAAACCACATTCGGCCATTTACGCCTTAGACGTAATTTGCATTAAGTTTTTCGGCATGGACTTGAC 

AAGCGGACTGTTTTCTAAACAGAGCATCCCACTAACGTACCATCCCGCCGATTCAGCGAGGCCGGTAGCTCATTGGGACA

ACAGCCCAGGAACCCGCAAGTATGGGTACGATCACGCCATTGCCGCCGAACTCTCCCGTAGATTTCCGGTGTTCCAGCTA 

GCTGGGAAGGGCACACAACTTGATTTGCAGACGGGGAGAACCAGAGTTATCTCTGCACAGCATAACCTGGTCCCGGTGAA 

CCGCAATCTTCCTCACGCCTTAGTCCCCGAGTACAAGGAGAAGCAACCCGGCCCGGTCAAAAAATTCTTGAACCAGTTCA 
AACACCACTCAGTACTTGTGGTATCAGAGGAAAAAATTGAAGCTCCCCGTAAGAGAATCGAATGGATCGCCCCGATTGGC

ATAGCCGGTGCAGATAAGAACTACAACCTGGCTTTCGGGTTTCCGCCGCAGGCACGGTACGACCTGGTGTTCATCAACAT 

TGGAACTAAATACAGAAACCACCACTTTCAGCAGTGCGAAGACCATGCGGCGACCTTAAAAACCCTTTCGCGTTCGGCCC 

TGAATTGCCTTAACCCAGGAGGCACCCTCGTGGTGAAGTCCTATGGCTACGCCGACCGCAACAGTGAGGACGTAGTCACC 

GCTCTTGCCAGAAAGTTTGTCAGGGTGTCTGCAGCGAGACCAGATTGTGTCTCAAGCAATACAGAAATGTACCTGATTTT 

CCGACAACTAGACAACAGCCGTACACGGCAATTCACCCCGCACCATCTGAATTGCGTGATTTCGTCCGTGTATGAGGGTA 

CAAGAGATGGAGTTGGAGCCGCGCCGTCATACCGCACCAAAAGGGAGAATATTGCTGACTGTCAAGAGGAAGCAGTTGTC 

AACGCAGCCAATCCGCTGGGTAGACCAGGCGAAGGAGTCTGCCGTGCCATCTATAAACGTTGGCCGACCAGTTTTACCGA 

TTCAGCCACGGAGACAGGCACCGCAAGAATGACTGTGTGCCTAGGAAAGAAAGTGATCCACGCGGTCGGCCCTGATTTCC

GGAAGCACCCAGAAGCAGAAGCCTTGAAATTGCTACAAAACGCCTACCATGCAGTGGCAGACTTAGTAAATGAACATAAC 

ATCAAGTCTGTCGCCATTCCACTGCTATCTACAGGCATTTACGCAGCCGGAAAAGACCGCCTTGAAGTATCACTTAACTG 

CTTGACAACCGCGCTAGACAGAACTGACGCGGACGTAACCATCTATTGCCTGGATAAGAAGTGGAAGGAAAGAATCGACG 

CGGCACTCCAACTTAAGGAGTCTGTAACAGAGCTGAAGGATGAAGATATGGAGATCGACGATGAGTTAGTATGGATtCAT

CCAGACAGTTGCTTGAAGGGAAGAAAGGGATTCAGTACTACAAAAGGAAAATTGTATTCGTACTTCGAAGGCACCAAATT 

CCATCAAGCAGCAAAAGACATGGCGGAGATAAAGGTCCTGTTCCCTAATGACCAGGAAAGTAATGAACAACTGTGTGCCT 

ACATATTGGGTGAGACCATGGAAGCAATCCGCGAAAAGTGCCCGGTCGACCATAACCCGTCGTCTAGCCCGCCCAAAACG 

TTGCCGTGCCTTTGCATGTATGCCATGACGCCAGAAAGGGTCCACAGACTTAGAAGCAATAACGTCAAAGAAGTTACAGT 

ATGCTCCTCCACCCCCCTTCCTAAGCACAAAATTAAGAATGTTCAGAAGGTTCAGTGCACGAAAGTAGTCCTGTTTAATC 

CGCACACTCCCGCATTCGTTCCCGCCCGTAAGTACATAGAAGTGCCAGAACAGCCTACCGCTCCTCCTGCACAGGCCGAG 

GAGGCCCCCGAAGTTGTAGCGACACCGTCACCATCTACAGCTGATAACACCTCGCTTGATGTCACAGACATCTCACTGGA 

TATGGATGACAGTAGCGAAGGCTCACTTTTTTCGAGCTTTAGCGGATCGGACAACTCTATTACTAGTATGGACAGTTGGT 

CGTCAGGACCTAGTTCACTAGAGATAGTAGACCGAAGGCAGGTGGTGGTGGCTGACGTTCATGCCGTCCAAGAGCCTGCC 

CCTATTCCACCGCCAAGGCTAAAGAAGATGGCCCGCCTGGCAGCGGCAAGAAAAGAGCCCACTCCACCGGCAAGCAATAG 

CTCTGAGTCCCTCCACCTCTCTTTTGGTGGGGTATCCATGTCCCTCGGATCAATTTTCGACGGAGAGACGGCCCGCCAGG 

CAGCGGTACAACCCCTGGCAACAGGCCCCACGGATGTGCCTATGTCTTTCGGATCGTTTTCCGACGGAGAGATTGATGAG 

CTGAGCCGCAGAGTAACTGAGTCCGAACCCGTCCTGTTTGGATCATTTGAACCGGGCGAAGTGAACTCAATTATATCGTC 

CCGATCAGCCGTATCTTTTCCACTACGCAAGCAGAGACGTAGACGCAGGAGCAGGAGGACTGAATACTGACTAACCGGGG 

TAGGTGGGTACATATTTTCGACGGACACAGGCCCTGGGCACTTGCAAAAGAAGTCCGTTCTGCAGAACCAGCTTACAGAA 

CCGACCTTGGAGCGCAATGTCCTGGAAAGAATTCATGCCCCGGTGCTCGACACGTCGAAAGAGGAACAACTCAAACTCAG

GTACCAGATGATGCCCACCGAAGCCAACAAAAGTAGGTACCAGTCTCGTAAAGTAGAAAATCAGAAAGCCATAACCACTG 

AGCGACTACTGTCAGGACTACGACTGTATAACTCTGCCACAGATCAGCCAGAATGCTATAAGATCACCTATCCGAAACCA 

TTGTACTCCAGTAGCGTACCGGCGAACTACTCCGATCCACAGTTCGCTGTAGCTGTCTGTAACAACTATCTGCATGAGAA 

CTATCCGACAGTAGCATCTTATCAGATTACTGACGAGTACGATGCTTACTTGGATATGGTAGACGGGACAGTCGCCTGCC 

TGGATACTGCAACCTTCTGCCCCGCTAAGCTTAGAAGTTACCCGAAAAAACATGAGTATAGAGCCCCGAATATCCGCAGT 

GCGGTTCCATCAGCGATGCAGAACACGCTACAAAATGTGCTCATTGCCGCAACTAAAAGAAATTGCAACGTCACGCAGAT 

GCGTGAACTGCCAACACTGGACTCAGCGACATTCAATGTCGAATGCTTTCGAAAATATGCATGTAATGACGAGTATTGGG

AGGAGTTCGCTCGGAAGCCAATTAGGATTACCACTGAGTTTGTCACCGCATATGTAGCTAGACTGAAAGGCCCTAAGGCC 

GCCGCACTATTTGCAAAGACGTATAATTTGGTCCCATTGCAAGAAGTGCCTATGGATAGATTCGTCATGGACATGAAAAG 

AGACGTGAAAGTTACACCAGGCACGAAACACACAGAAGAAAGACCGAAAGTACAAGTGATACAAGCCGCAGAACCCCTGG 

CGACTGCTTACTTATGCGGGATTCACCGGGAATTAGTGCGTAGGCTTACGGCCGTCTTGCTTCCAAACATTCACACGCTT 

TTTGACATGTCGGCGGAGGATTTTGATGCAATCATAGCAGAACACTTCAAGCAAGGCGACCCGGTACTGGAGACGGATAT 

CGCATCATTCGACAAAAGCCAAGACGACGCTATGGCGTTAACCGGTCTGATGATCTTGGAGGACCTGGGTGTGGATCAAC 

CACTACTCGACTTGATCGAGTGCGCCTTTGGAGAAATATCATCCACCCATCTACCTACGGGTACTCGTTTTAAATTCGGG 

GCGATGATGAAATCCGGAATGTTCCTCACACTTTTTGTCAACACAGTTTTGAATGTCGTTATCGCCAGCAGAGTACTAGA 

AGAGCGGCTTAAAACGTCCAGATGTGCAGCGTTCATTGGCGACGACAACATCATACATGGAGTAGTATCTGACAAAGAAA 
TGGCTGAGAGGTGCGCCACCTGGCTCAACATGGAGGTTAAGATCATCGACGCAGTCATCGGTGAGAGACCACCTTACTTC

TGCGGCGGATTTATCTTGCAAGATTCGGTTACTTCCACAGCGTGCCGCGTGGCGGATCCCCTGAAAAGGCTGTTTAAGTT 

GGGTAAACCGCTCCCAGCCGACGACGAGCAAGACGAAGACAGAAGACGCGCTCTGCTAGATGAAACAAAGGCGTGGTTTA 

GAGTAGGTATAACAGGCACTTTAGCAGTGGCCGTGACGACCCGGTATGAGGTAGACAATATTACACCTGTCCTACTGGCA 

TTGAGAACTTTTGCCCAGAGCAAAAGAGCATTCCAAGCCATCAGAGGGGAAATAAAGCATCTCTACGGTGGTCCTAAATA 

GTCAGCATAGTACATTTCATCTGACTAATACTACAACACCACCACCtctagoggcgcggagatgggggtgcocgaotgtc 

ctgcctggctgtggcttctcctgtccctgctgtcgctccctctgggcctcccagtcctgggcgccccaccacgcctcotc 

tgtgacogccgagtcctggQgaggtacctcttggaggccoaggaggccgagoatotcocgacgggctgtgctgoQcoctg 

cogcttgaotgogaototcoctgtcccagacaccaaogttootttctatgcctggoagoggotggoggtcgggcogcagg 

ccgtogaagtctggcagggcctggccctgctgtcggoagctgtcctgcggggccoggccctgttggtcooctcttcccog 

ccgtgggagcccctgcagctgcatgtggatooagccgtcogtggccttcgcagcctcoccQctctgcttcgggctctggg 

agcccagoaggoogccatctcccctccogotgcggcctcogctgctccactccgoocootcoctgctgacact.ttccgco 

oactcttccgogtctactccaotltcctccggggaaagctgoagctgtacacoggggaggcctgcoggacoggggocogo 

tgagcotgcaggccttgggCCCAATGATCCGACCAGCAAAACTCGATGTACTTCCGAGGAACTGATGTGCATAATGCATC

AGGCTGGTACATTAGATCCCCGCTTACCGCGGGCAATATAGCAACACTAAAAACTCGATGTACTTCCGAGGAAGCGCAGT 

gCATAATGCTGCGCaGTGTTGCCACATAACCACTATATTAACCATTTATCTAGCGGACGCCAAAAACTCAATGTATTTCT 

GAGGAAGCGTGGTGCATAATGCCACGCAGCGTCTGCATAACTTTTATTATTTCTTTTATTAATCAACAAAATTTTGTTTT 

TAACATTTCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGGGAATTCctcgattaattaagcggccgctcgaG

GGGAATTAATTCTTGAAGACGAAAGGGCCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTT 

CTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTA 

TGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACG

CTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGAT 

CCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCC 

GTGTTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACA 

GAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAA 

CTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTG 

ATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTGtAGCAATGGCAACAACG 

TTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGT 

TGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGGTCTC 

GCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACT 

ATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTC 

ATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGA 

CCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCT 

TTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCT 

ACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTAG 

GCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGC 

GATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTC 

GTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCATTGAGAAAGCGCCACGC 

TTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGG 

GGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGG 

GGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGAGCTCgtatggocatattgtcgttagoocgcggctocaattoatac 

ataoccttatgtotcotacocotacgatttaggggocactotag 
ATTGACGGCGTAGTACACACTATTGAATCAAACAGCCGACCAATTGCACTACCATCACAATGGAGAAGCCAGTAGTAAAC

GTAGACGTAGACCCCCAGAGTCCGTTTGTCGTGCAACTGCAAAAAAGCTTCCCGCAATTTGAGGTAGTAGCACAGCAGGT 

CACTCCAAATGACCATGCTAATGCCAGAGCATTTTCGCATCTGGCCAGTAAACTAATCGAGCTGGAGGTTCCTACCACAG 

CGACGATCTTGGACATAGGCAGCGCACCGGCTCGTAGAATGTTTTCCGAGCACCAGTATCATTGTGTCTGCCCCATGCGT 

AGTCCAGAAGACCCGGACCGCATGATGAAATACGCCAGTAAACTGGCGGAAAAAGCGTGCAAGATTACAAACAAGAACTT 

GCATGAGAAGATTAAGGATCTCCGGACCGTACTTGATACGCCGGATGCTGAAACACCATCGCTCTGCTTTCACAACGATG 

TTACCTGCAACATGCGTGCCGAATATTCCGTCATGCAGGACGTGTATATCAACGCTCCCGGAACTATCTATCATCAGGCT 

ATGAAAGGCGTGCGGACCCTGTACTGGATTGGCTTCGACACCACCCAGTTCATGTTCTCGGCTATGGCAGGTTCGTACCC 

TGCGTACAACACCAACTGGGCCGACGAGAAAGTCCTTGAAGCGCGTAACATCGGACTTTGCAGCACAAAGCTGAGTGAAG 

GTAGGACAGGAAAATTGTCGATAATGAGGAAGAAGGAGTTGAAGCCCGGGTCGCGGGTTTATTTCTCCGTAGGATCGACA 

CTTTATCCAGAACACAGAGCCAGCTTGCAGAGCTGGCATCTTCCATCGGTGTTCCACTTGAATGGAAAGCAGTCGTACAC 

TTGCCGCTGTGATACAGTGGTGAGTTGCGAAGGCTACGTAGTGAAGAAAATCACCATCAGTCCCGGGATCACGGGAGAAA 

CCGTGGGATACGCGGTTACACACAATAGCGAGGGCTTCTTGCTATGCAAAGTTACTGACACAGTAAAAGGAGAACGGGTA 

TCGTTCCCTGTGTGCACGTACATCCCGGCCACCATATGCGATCAGATGACTGGTATAATGGCCACGGATATATCACCTGA 

CGATGCACAAAAACTTCTGGTTGGGCTCAACCAGCGAATTGTCATTAACGGTAGGACTAACAGGAACACCAACACCATGC

AAAATTACCTTCTGCCGATCATAGCACAAGGGTTCAGCAAATGGGCTAAGGAGCGCAAGGATGATCTTGATAACGAGAAA 

ATGCTGGGTACTAGAGAACGCAAGCTTACGTATGGCTGCTTGTGGGCGTTTCGCACTAAGAAAGTACATTCGTTTTATCG 

CCCACCTGGAACGCAGACCTGCGTAAAAGTCCCAGCCTCTTTTAGCGCTTTTCCCATGTCGTCCGTATGGACGACCTCTT 

TGCCCATGTCGCTGAGGCAGAAATTGAAACTGGCATTGCAACCAAAGAAGGAGGAAAAACTGCTGCAGGTCTCGGAGGAA 

TTAGTCATGGAGGCCAAGGCTGCTTTTGAGGATGCTCAGGAGGAAGCCAGAGCGGAGAAGCTCCGAGAAGCACTTCCACC 

ATTAGTGGCAGACAAAGGCATCGAGGCAGCCGCAGAAGTTGTCTGCGAAGTGGAGGGGCTCCAGGCGGACATCGGAGCAG 

CATTAGTTGAAACCCCGCGCGGTCACGTAAGGATAATACCTCAAGCAAATGACCGTATGATCGGACAGTATATCGTTGTC 

TCGCCAAACTCTGTGCTGAAGAATGCCAAACTCGCACCAGCGCACCCGCTAGCAGATCAGGTTAAGATCATAACACACTC 

CGGAAGATCAGGAAGGTACGCGGTCGAACCATACGACGCTAAAGTACTGATGCCAGCAGGAGGTGCCGTACCATGGCCAG 

AATTCCTAGCACTGAGTGAGAGCGCCACGTTAGTGTACAACGAAAGAGAGTTTGTGAACCGCAAACTATACCACATTGCC

ATGCATGGCCCCGCCAAGAATACAGAAGAGGAGCAGTACAAGGTTACAAAGGCAGAGCTTGCAGAAACAGAGTACGTGTT

TGACGTGGACAAGAAGCGTTGCGTTAAGAAGGAAGAAGCCTCAGGTCTGGTCCTCTCGGGAGAACTGACCAACCCTCCCT 

ATCATGAGCTAGCTCTGGAGGGACTGAAGACCCGACCTGCGGTCCCGTACAAGGTCGAAACAATAGGAGTGATAGGCACA 

CCGGGGTCGGGCAAGTCAGCTATTATCAAGTCAACTGTCACGGCACGAGATCTTGTTACCAGCGGAAAGAAAGAAAATTG 

TCGCGAAATTGAGGCCGACGTGCTAAGACTGAGGGGTATGCAGATTACGTCGAAGACAGTAGATTCGGTTATGCTCAACG 

GATGCCACAAAGCCGTAGAAGTGCTGTACGTTGACGAAGCGTTCGCGTGCCACGCAGGAGCACTACTTGCCTTGATTGCT 

ATCGTCAGGCCCCGCAAGAAGGTAGTACTATGCGGAGACCCCATGCAATGCGGATTCTTCAACATGATGCAACTAAAGGT 

ACATTTCAATCACCCTGAAAAAGACATATGCACCAAGACATTCTACAAGTATATCTCCCGGCGTTGCACACAGCCAGTTA 

CAGCTATTGTATCGACACTGCATTACGATGGAAAGATGAAAACCACGAACCCGTGCAAGAAGAACATTGAAATCGATATT 

ACAGGGGCCACAAAGCCGAAGCCAGGGGATATCATCCTGACATGTTTCCGCGGGTGGGTTAAGCAATTGCAAATCGACTA 

TCCCGGACATGAAGTAATGACAGCCGCGGCCTCACAAGGGCTAACCAGAAAAGGAGTGTATGCCGTCCGGCAAAAAGTCA 

ATGAAAACCCACTGTACGCGATCACATCAGAGCATGTGAACGTGTTGCTCACCCGCACTGAGGACAGGCTAGTGTGGAAA 

ACCTTGCAGGGCGACCCATGGATTAAGCAGCCCACTAACATACCTAAAGGAAACTTTCAGGCTACTATAGAGGACTGGGA 

AGCTGAACACAAGGGAATAATTGCTGCAATAAACAGCCCCACTCCCCGTGCCAATCCGTTCAGCTGCAAGACCAACGTTT 

GCTGGGCGAAAGCATTGGAACCGATACTAGCCACGGCCGGTATCGTACTTACCGGTTGCCAGTGGAGCGAACTGTTCCCA 

CAGTTTGCGGATGACAAACCACATTCGGCCATTTACGCCTTAGACGTAATTTGCATTAAGTTTTTCGGCATGGACTTGAC 

AAGCGGACTGTTTTCTAAACAGAGCATCCCACTAACGTACCATCCCGCCGATTCAGCGAGGCCGGTAGCTCATTGGGACA 

ACAGCCCAGGAACCCGCAAGTATGGGTACGATCACGCCATTGCCGCCGAACTCTCCCGTAGATTTCCGGTGTTCCAGCTA 

GCTGGGAAGGGCACACAACTTGATTTGCAGACGGGGAGAACCAGAGTTATCTCTGCACAGCATAACCTGGTCCCGGTGAA 

CCGCAATCTTCCTCACGCCTTAGTCCCCGAGTACAAGGAGAAGCAACCCGGCCCGGTCAAAAAATTCTTGAACCAGTTCA 
AACACCACTCAGTACTTGTGGTATCAGAGGAAAAAATTGAAGCTCCCCGTAAGAGAATCGAATGGATCGCCCCGATTGGC

ATAGCCGGTGCAGATAAGAACTACAACCTGGCTTTCGGGTTTCCGCCGCAGGCACGGTACGACCTGGTGTTCATCAACAT 

TGGAACTAAATACAGAAACCACCACTTTCAGCAGTGCGAAGACCATGCGGCGACCTTAAAAACCCTTTCGCGTTCGGCCC 

TGAATTGCCTTAACCCAGGAGGCACCCTCGTGGTGAAGTCCTATGGCTACGCCGACCGCAACAGTGAGGACGTAGTCACC

GCTCTTGCCAGAAAGTTTGTCAGGGTGTCTGCAGCGAGACCAGATTGTGTCTCAAGCAATACAGAAATGTACCTGATTTT 

CCGACAACTAGACAACAGCCGTACACGGCAATTCACCCCGCACCATCTGAATTGCGTGATTTCGTCCGTGTATGAGGGTA 

CAAGAGATGGAGTTGGAGCCGCGCCGTCATACCGCACCAAAAGGGAGAATATTGCTGACTGTCAAGAGGAAGCAGTTGTC 

AACGCAGCCAATCCGCTGGGTAGACCAGGCGAAGGAGTCTGCCGTGCCATCTATAAACGTTGGCCGACCAGTTTTACCGA 

TTCAGCCACGGAGACAGGCACCGCAAGAATGACTGTGTGCCTAGGAAAGAAAGTGATCCACGCGGTCGGCCCTGATTTCC 

GGAAGCACCCAGAAGCAGAAGCCTTGAAATTGCTACAAAACGCCTACCATGCAGTGGCAGACTTAGTAAATGAACATAAC 

ATCAAGTCTGTCGCCATTCCACTGCTATCTACAGGCATTTACGCAGCCGGAAAAGACCGCCTTGAAGTATCACTTAACTG 

CTTGACAACCGCGCTAGACAGAACTGACGCGGACGTAACCATCTATTGCCTGGATAAGAAGTGGAAGGAAAGAATCGACG 

CGGCACTCCAACTTAAGGAGTCTGTAACAGAGCTGAAGGATGAAGATATGGAGATCGACGATGAGTTAGTATGGATtCAT 

CCAGACAGTTGCTTGAAGGGAAGAAAGGGATTCAGTACTACAAAAGGAAAATTGTATTCGTACTTCGAAGGCACCAAATT 

CCATCAAGCAGCAAAAGACATGGCGGAGATAAAGGTCCTGTTCCCTAATGACCAGGAAAGTAATGAACAACTGTGTGCCT 

ACATATTGGGTGAGACCATGGAAGCAATCCGCGAAAAGTGCCCGGTCGACCATAACCCGTCGTCTAGCCCGCCCAAAACG 

TTGCCGTGCCTTTGCATGTATGCCATGACGCCAGAAAGGGTCCACAGACTTAGAAGCAATAACGTCAAAGAAGTTACAGT 

ATGCTCCTCCACCCCCCTTCCTAAGCACAAAATTAAGAATGTTCAGAAGGTTCAGTGCACGAAAGTAGTCCTGTTTAATC

CGCACACTCCCGCATTCGTTCCCGCCCGTAAGTACATAGAAGTGCCAGAACAGCCTACCGCTCCTCCTGCACAGGCCGAG 

GAGGCCCCCGAAGTTGTAGCGACACCGTCACCATCTACAGCTGATAACACCTCGCTTGATGTCACAGACATCTCACTGGA 

TATGGATGACAGTAGCGAAGGCTCACTTTTTTCGAGCTTTAGCGGATCGGACAACTCTATTACTAGTATGGACAGTTGGT 

CGTCAGGACCTAGTTCACTAGAGATAGTAGACCGAAGGCAGGTGGTGGTGGCTGACGTTCATGCCGTCCAAGAGCCTGCC 

CCTATTCCACCGCCAAGGCTAAAGAAGATGGCCCGCCTGGCAGCGGCAAGAAAAGAGCCCACTCCACCGGCAAGCAATAG

CTCTGAGTCCCTCCACCTCTCTTTTGGTGGGGTATCCATGTCCCTCGGATCAATTTTCGACGGAGAGACGGCCCGCCAGG 

CAGCGGTACAACCCCTGGCAACAGGCCCCACGGATGTGCCTATGTCTTTCGGATCGTTTTCCGACGGAGAGATTGATGAG 

CTGAGCCGCAGAGTAACTGAGTCCGAACCCGTCCTGTTTGGATCATTTGAACCGGGCGAAGTGAACTCAATTATATCGTC 

CCGATCAGCCGTATCTTTTCCACTACGCAAGCAGAGACGTAGACGCAGGAGCAGGAGGACTGAATACTGACTAACCGGGG 

TAGGTGGGTACATATTTTCGACGGACACAGGCCCTGGGCACTTGCAAAAGAAGTCCGTTCTGCAGAACCAGCTTACAGAA 

CCGACCTTGGAGCGCAATGTCCTGGAAAGAATTCATGCCCCGGTGCTCGACACGTCGAAAGAGGAACAACTCAAACTCAG 

GTACCAGATGATGCCCACCGAAGCCAACAAAAGTAGGTACCAGTCTCGTAAAGTAGAAAATCAGAAAGCCATAACCACTG 

AGCGACTACTGTCAGGACTACGACTGTATAACTCTGCCACAGATCAGCCAGAATGCTATAAGATCACCTATCCGAAACCA 

TTGTACTCCAGTAGCGTACCGGCGAACTACTCCGATCCACAGTTCGCTGTAGCTGTCTGTAACAACTATCTGCATGAGAA 

CTATCCGACAGTAGCATCTTATCAGATTACTGACGAGTACGATGCTTACTTGGATATGGTAGACGGGACAGTCGCCTGCC 

TGGATACTGCAACCTTCTGCCCCGCTAAGCTTAGAAGTTACCCGAAAAAACATGAGTATAGAGCCCCGAATATCCGCAGT 

GCGGTTCCATCAGCGATGCAGAACACGCTACAAAATGTGCTCATTGCCGCAACTAAAAGAAATTGCAACGTCACGCAGAT 

GCGTGAACTGCCAACACTGGACTCAGCGACATTCAATGTCGAATGCTTTCGAAAATATGCATGTAATGACGAGTATTGGG 

AGGAGTTCGCTCGGAAGCCAATTAGGATTACCACTGAGTTTGTCACCGCATATGTAGCTAGACTGAAAGGCCCTAAGGCC 

GCCGCACTATTTGCAAAGACGTATAATTTGGTCCCATTGCAAGAAGTGCCTATGGATAGATTCGTCATGGACATGAAAAG 

AGACGTGAAAGTTACACCAGGCACGAAACACACAGAAGAAAGACCGAAAGTACAAGTGATACAAGCCGCAGAACCCCTGG

CGACTGCTTACTTATGCGGGATTCACCGGGAATTAGTGCGTAGGCTTACGGCCGTCTTGCTTCCAAACATTCACACGCTT 

TTTGACATGTCGGCGGAGGATTTTGATGCAATCATAGCAGAACACTTCAAGCAAGGCGACCCGGTACTGGAGACGGATAT 

CGCATCATTCGACAAAAGCCAAGACGACGCTATGGCGTTAACCGGTCTGATGATCTTGGAGGACCTGGGTGTGGATCAAC 

CACTACTCGACTTGATCGAGTGCGCCTTTGGAGAAATATCATCCACCCATCTACCTACGGGTACTCGTTTTAAATTCGGG 

GCGATGATGAAATCCGGAATGTTCCTCACACTTTTTGTCAACACAGTTTTGAATGTCGTTATCGCCAGCAGAGTACTAGA 

AGAGCGGCTTAAAACGTCCAGATGTGCAGCGTTCATTGGCGACGACAACATCATACATGGAGTAGTATCTGACAAAGAAA 
TGGCTGAGAGGTGCGCCACCTGGCTCAACATGGAGGTTAAGATCATCGACGCAGTCATCGGTGAGAGACCACCTTACTTC

TGCGGCGGATTTATCTTGCAAGATTCGGTTACTTCCACAGCGTGCCGCGTGGCGGATCCCCTGAAAAGGCTGTTTAAGTT 

GGGTAAACCGCTCCCAGCCGACGACGAGCAAGACGAAGACAGAAGACGCGCTCTGCTAGATGAAACAAAGGCGTGGTTTA 

GAGTAGGTATAACAGGCACTTTAGCAGTGGCCGTGACGACCCGGTATGAGGTAGACAATATTACACCTGTCCTACTGGCA 

TTGAGAACTTTTGCCCAGAGCAAAAGAGCATTCCAAGCCATCAGAGGGGAAATAAAGCATCTCTACGGTGGTCCTAAATA 

GTCAGCATAGTACATTTCATCTGACTAATACTACAACACCACCACCtctagotcogcccggccgggctccgaggcgagag 

gctgcatggagtggccggcgcggctctgcgggctgtgggcgctgctgctctgcgccggcggcgggggcgggggcgggggc 

gccgcgcctacggaaoctcagccocctgtgocaoatttgogtgtctctgttgaoaacctctgcacagtaatatggQcatg 

gootccocccgogggogccogctcoaottgtagtctatggtattttQgtcQttttggcgocoaacaGgataagaoaatag 

ctccggaaactcgtcgttcaatagaagtacccctgaatgagaggatttgtctgcaagtggggtcccagtgtagcaccaat 

gagagtgagaagcctogcattttggttgaaaaatgcatctcacccccagaaggtgatcctgagtctgctgtgoctgagct 

tcoatgcatttggcocoacctgagctacatgaagtgttcttggctccctggooggaatoccogtcccgococtaoctato 

ctctctactattggcacagaagcctggaaaoaQttcQtcaatgtgaQaocatctttagogaaggccaatactttggttgt 

tcctttgQtctgQccaoagtgaaggottccogttttgoocaacocagtgtccoaotaatggtcaaggataatgcaggaao 

oottaoaccatccttcoatatagtgcctttoacttcccgtgtgaaocctgctcctccocotattooaoacctctccttcc 

ocGatgotgocctatatgtgcGatgggagaatccacogoatUtattagcagatgcctottttatgaogtagaagtcaat 

aacogccaaoctgagacacataotgttttctQcgtccaagaggctaQatgtgagoatccagaatttgagagaootgtgga 

gaatocotcttgtttcotggtccctggtgttcttcctgatactttgoacocagtcagaotaagagtcoaaocoaataagt 

tatgctQtgoggatgacoaactctggogtaaLtggagccQagQQatgogtatQggtaogaagcgcQattccQcactctQC 

Qtaaccatgt tact cat tgttccagtcatcgtcgcaggtgcaatca tag toe tec tgctttacctaaaaaggctcaagat 

tattotottccctccaottcctgotcctggcaogatttttoaagaaotgtttggogoccagoatgatgatactctgcact 

ggaogoQgtocgocotctatgogaagcaaaccaaggaggaaQccgactctgtagtgctgatogaaaocctgoagaoogcc 

tctcagtgatggagotoat t tot t tttacct tcactgtgacct tgagaogot tcttcccat tctccott tgt tatctggg 

oacttottaQatggoaoctgaaoctQctgcaccGtttaaaaacaggcagctcataagagccocaggtctttatgttgagt 

cgcgcoccgaoaaoctGaaoQtaatgggcgctttggogoagogtgtggagtcottctcottgaattotaQQagccogcag 

gcttcaoactaggggacooagcaaaoagtgatgatogtggtggagttQatcttatcaagagttgtgocaacttcctgagg 

gate tatocttgctttgtgttctttgtgtcoocatgaacoaattt tat ttgtaggggooc teat ttggggtgcaaatgct 

aatgtcaaacttgagtcacaaagoacatgtogaoaacoaaatggotoaaatctgatatgtattgtttgggatcctattgo 

occatgtttgtggctattaaaactcttttaacagtctgggctgggtccggtggctcacgcctgtaatcccagcaatttgg 

gagtccgaggcgggcggatcactcgagctgcaggcatgcaagcttggcactggccgtcgttttacaocgtcgtgcctggg 

aaaoccctggcgttacccaacttoatcgccttgcagcacatccccctttcgccagccttgggCCCAATGATCCGACCAGC 

AAAACTCGATGTACTTCCGAGGAACTGATGTGCATAATGCATCAGGCTGGTACATTAGATCCCCGCTTACCGCGGGCAAT 

ATAGCAACACTAAAAACTCGATGTACTTCCGAGGAAGCGCAGTgCATAATGCTGCGCaGTGTTGCCACATAACCACTATA 

TTAACCATTTATCTAGCGGACGCCAAAAACTCAATGTATTTCTGAGGAAGCGTGGTGCATAATGCCACGCAGCGTCTGCA 

TAACTTTTATTATTTCTTTTATTAATCAACAAAATTTTGTTTTTAACATTTCAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

AAAAAAAAAGGGAATTCctcgattaattaagcggccgctcgaGGGGAATTAATTCTTGAAGACGAAAGGGCCAGGTGGCA 

CTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAA 

TAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTT 

TTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTG 

CACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATG 

ATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTGTTGACGCCGGGCAAGAGCAACTCGGTCGCCGCAT 
ACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAAT
TATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTA 
ACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAA 
CGACGAGCGTGACACCACGATGCCTGtAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAG 
CTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGC 
TGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCC 
CTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTG 
CCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAA 
TTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGC 
GTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAA 
AACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGA 
GCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATA 
CCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGAT 
AGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACC 
GAACTGAGATACCTACAGCGTGAGCATTGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAG 
CGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTC 
GCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGAGC 
Tcgtotggacotattgtcgttagaocgcggctacaottootocotoaccttotgtotcotococotacgatttaggggoc 
actatag 
GGATATAGTGGTGAGTATCCCCGCCTGTCACGCGGGAGACCGGGGTTCGGTTCCCCGACGGGGAGCCAAACAGCCGACCA

ATTGCACTACCATCACAATGGAGAAGCCAGTAGTAAACGTAGACGTAGACCCCCAGAGTCCGTTTGTCGTGCAACTGCAA 

AAAAGCTTCCCGCAATTTGAGGTAGTAGCACAGCAGGTCACTCCAAATGACCATGCTAATGCCAGAGCATTTTCGCATCT 

GGCCAGTAAACTAATCGAGCTGGAGGTTCCTACCACAGCGACGATCTTGGACATAGGCAGCGCACCGGCTCGTAGAATGT

TTTCCGAGCACCAGTATCATTGTGTCTGCCCCATGCGTAGTCCAGAAGACCCGGACCGCATGATGAAATACGCCAGTAAA 

CTGGCGGAAAAAGCGTGCAAGATTACAAACAAGAACTTGCATGAGAAGATTAAGGATCTCCGGACCGTACTTGATACGCC 

GGATGCTGAAACACCATCGCTCTGCTTTCACAACGATGTTACCTGCAACATGCGTGCCGAATATTCCGTCATGCAGGACG 

TGTATATCAACGCTCCCGGAACTATCTATCATCAGGCTATGAAAGGCGTGCGGACCCTGTACTGGATTGGCTTCGACACC 

ACCCAGTTCATGTTCTCGGCTATGGCAGGTTCGTACCCTGCGTACAACACCAACTGGGCCGACGAGAAAGTCCTTGAAGC 

GCGTAACATCGGACTTTGCAGCACAAAGCTGAGTGAAGGTAGGACAGGAAAATTGTCGATAATGAGGAAGAAGGAGTTGA 

AGCCCGGGTCGCGGGTTTATTTCTCCGTAGGATCGACACTTTATCCAGAACACAGAGCCAGCTTGCAGAGCTGGCATCTT 

CCATCGGTGTTCCACTTGAATGGAAAGCAGTCGTACACTTGCCGCTGTGATACAGTGGTGAGTTGCGAAGGCTACGTAGT 

GAAGAAAATCACCATCAGTCCCGGGATCACGGGAGAAACCGTGGGATACGCGGTTACACACAATAGCGAGGGCTTCTTGC 

TATGCAAAGTTACTGACACAGTAAAAGGAGAACGGGTATCGTTCCCTGTGTGCACGTACATCCCGGCCACCATATGCGAT 

CAGATGACTGGTATAATGGCCACGGATATATCACCTGACGATGCACAAAAACTTCTGGTTGGGCTCAACCAGCGAATTGT 

CATTAACGGTAGGACTAACAGGAACACCAACACCATGCAAAATTACCTTCTGCCGATCATAGCACAAGGGTTCAGCAAAT 

GGGCTAAGGAGCGCAAGGATGATCTTGATAACGAGAAAATGCTGGGTACTAGAGAACGCAAGCTTACGTATGGCTGCTTG 

TGGGCGTTTCGCACTAAGAAAGTACATTCGTTTTATCGCCCACCTGGAACGCAGACCTGCGTAAAAGTCCCAGCCTCTTT 

TAGCGATCCCCTGAAAAGGCTGTTTAAGTTGGGTAAACCGCTCCCAGCCGACGACGAGCAAGACGAAGACAGAAGACGCG 

CTCTGCTAGATGAAACAAAGGCGTGGTTTAGAGTAGGTATAACAGGCACTTTAGCAGTGGCCGTGACGACCCGGTATGAG 

GTAGACAATATTACACCTGTCCTACTGGCATTGAGAACTTTTGCCCAGAGCAAAAGAGCATTCCAAGCCATCAGAGGGGA

AATAAAGCATCTCTACGGTGGTCCTAAATAGTCAGCATAGTACATTTCATCTGACTAATACTACAACACCACCACCATGA 

ATAGAGGATTCTTTAACATGCTCGGCCGCCGCCCCTTCCCGGCCCCCACTGCCATGTGGAGGCCGCGGAGAAGGAGGCAG 

GCGGCCCCGATGCCTGCCCGCAACGGGCTGGCTTCTCAAATCCAGCAACTGACCACAGCCGTCAGTGCCCTAGTCATTGG 

ACAGGCAACTAGACCTCAACCCCCACGTCCACGCCCGCCACCGCGCCAGAAGAAGCAGGCGCCCAAGCAACCACCGAAGC 

CGAAGAAACCAAAAACGCAGGAGAAGAAGAAGAAGCAACCTGCAAAACCCAAACCCGGAAAGAGACAGCGCATGGCACTT 

AAGTTGGAGGCCGACAGATTGTTCGACGTCAAGAACGAGGACGGAGATGTCATCGGGCACGCACTGGCCATGGAAGGAAA 

GGTAATGAAACCTCTGCACGTGAAAGGAACCATCGACCACCCTGTGCTATCAAAGCTCAAATTTACCAAGTCGTCAGCAT 

ACGACATGGAGTTCGCACAGTTGCCAGTCAACATGAGAAGTGAGGCATTCACCTACACCAGTGAACACCCCGAAGGATTC 

TATAACTGGCACCACGGAGCGGTGCAGTATAGTGGAGGTAGATTTACCATCCCTCGCGGAGTAGGAGGCAGAGGAGACAG 

CGGTCGTCCGATCATGGATAACTCCGGTCGGGTTGTCGCGATAGTCCTCGGTGGCGCTGATGAAGGAACACGAACTGCCC 

TTTCGGTCGTCACCTGGAATAGTAAAGGGAAGACAATTAAGACGACCCCGGAAGGGACAGAAGAGTGGTCCGCAGCACCA 

CTGGTCACGGCAATGTGTTTGCTCGGAAATGTGAGCTTCCCATGCGACCGCCCGCCCACATGCTATACCCGCGAACCTTC 

CAGAGCCCTCGACATCCTTGAAGAGAACGTGAACCATGAGGCCTACGATACCCTGCTCAATGCCATATTGCGGTGCGGAT 

CGTCTGGCAGAAGCAAAAGAAGCGTCATTGACGACTTTACCCTGACCAGCCCCTACTTGGGCACATGCTCGTACTGCCAC 

CATACTGTACCGTGCTTCAGCCCTGTTAAGATCGAGCAGGTCTGGGACGAAGCGGACGATAACACCATACGCATACAGAC 

TTCCGCCCAGTTTGGATACGACCAAAGCGGAGCAGCAAGCGCAAACAAGTACCGCTACATGTCGCTTAAGCAGGATCACA 

CCGTTAAAGAAGGCACCATGGATGACATCAAGATTAGCACCTCAGGACCGTGTAGAAGGCTTAGCTACAAAGGATACTTT 

CTCCTCGCAAAATGCCCTCCAGGGGACAGCGTAACGGTTAGCATAGTGAGTAGCAACTCAGCAACGTCATGTACACTGGC 

CCGCAAGATAAAACCAAAATTCGTGGGACGGGAAAAATATGATCTACCTCCCGTTCACGGTAAAAAAATTCCTTGCACAG 

TGTACGACCGTCTGAAAGAAACAACTGCAGGCTACATCACTATGCACAGGCCGAGACCGCACGCTTATACATCCTACCTG 

GAAGAATCATCAGGGAAAGTTTACGCAAAGCCGCCATCTGGGAAGAACATTACGTATGAGTGCAAGTGCGGCGACTACAA 

GACCGGAACCGTTTCGACCCGCACCGAAATCACTGGTTGCACCGCCATCAAGCAGTGCGTCGCCTATAAGAGCGACCAAA 

CGAAGTGGGTCTTCAACTCACCGGACTTGATCAGACATGACGACCACACGGCCCAAGGGAAATTGCATTTGCCTTTCAAG 

TTGATCCCGAGTACCTGCATGGTCCCTGTTGCCCACGCGCCGAATGTAATACATGGCTTTAAACACATCAGCCTCCAATT 
AGATACAGACCACTTGACATTGCTCACCACCAGGAGACTAGGGGCAAACCCGGAACCAACCACTGAATGGATCGTCGGAA

AGACGGTCAGAAACTTCACCGTCGACCGAGATGGCCTGGAATACATATGGGGAAATCATGAGCCAGTGAGGGTCTATGCC 

CAAGAGTCAGCACCAGGAGACCCTCACGGATGGCCACACGAAATAGTACAGCATTACTACCATCGCCATCCTGTGTACAC 

CATCTTAGCCGTCGCATCAGCTACCGTGGCGATGATGATTGGCGTAACTGTTGCAGTGTTATGTGCCTGTAAAGCGCGCC 

GTGAGTGCCTGACGCCATACGCCCTGGCCCCAAACGCCGTAATCCCAACTTCGCTGGCACTCTTGTGCTGCGTTAGGTCG 

GCCAATGCTGAAACGTTCACCGAGACCATGAGTTACTTGTGGTCGAACAGTCAGCCGTTCTTCTGGGTCCAGTTGTGCAT

ACCTTTGGCCGCTTTCATCGTTCTAATGCGCTGCTGCTCCTGCTGCCTGCCTTTTTTAGTGGTTGCCGGCGCCTACCTGG 

CGAAGGTAGACGCCTACGAACATGCGACCACTGTTCCAAATGTGCCACAGATACCGTATAAGGCACTTGTTGAAAGGGCA 

GGGTATGCCCCGCTCAATTTGGAGATCACTGTCATGTCCTCGGAGGTTTTGCCTTCCACCAACCAAGAGTACATTACCTG 

CAAATTCACCACTGTGGTCCCCTCCCCAAAAATCAAATGCTGCGGCTCCTTGGAATGTCAGCCGGCCGCTCATGCAGACT 

ATACCTGCAAGGTCTTCGGAGGGGTCTACCCCTTTATGTGGGGAGGAGCGCAATGTTTTTGCGACAGTGAGAACAGCCAG 

ATGAGTGAGGCGTACGTCGAATTGTCAGCAGATTGCGCGTCTGACCACGCGCAGGCGATTAAGGTGCACACTGCCGCGAT 

GAAAGTAGGACTGCGTATTGTGTACGGGAACACTACCAGTTTCCTAGATGTGTACGTGAACGGAGTCACACCAGGAACGT 

CTAAAGACTTGAAAGTCATAGCTGGACCAATTTCAGCATCGTTTACGCCATTCGATCATAAGGTCGTTATCCATCGCGGC 

CTGGtGTACAACTATGACTTCCCGGAATATGGAGCGATGAAACCAGGAGCGTTTGGAGACATTCAAGCTACCTCCTTGAC

TAGCAAGGATCTCATCGCCAGCACAGACATTAGGCTACTCAAGCCTTCCGCCAAGAACGTGCATGTCCCGTACACGCAGG 

CCTCATCAGGATTTGAGATGTGGAAAAACAACTCAGGCCGCCCACTGCAGGAAACCGCACCTTTCGGGTGTAAGATTGCA 

GTAAATCCGCTCCGAGCGGTGGACTGTTCATACGGGAACATTCCCATTTCTATTGACATCCCGAACGCTGCCTTTATCAG 

GACATCAGATGCACCACTGGTCTCAACAGTCAAATGTGAAGTCAGTGAGTGCACTTATTCAGCAGACTTCGGCGGGATGG 

CCACCCTGCAGTATGTATCCGACCGCGAAGGTCAATGCCCCGTACATTCGCATTCGAGCACAGCAACTCTCCAAGAGTCG 

ACAGTACATGTCCTGGAGAAAGGAGCGGTGACAGTACACTTTAGCACCGCGAGTCCACAGGCGAACTTTATCGTATCGCT 

GTGTGGGAAGAAGACAACATGCAATGCAGAATGTAAACCACCAGCTGACCATATCGTGAGCACCCCGCACAAAAATGACC 

AAGAATTTCAAGCCGCCATCTCAAAAACATCATGGAGTTGGCTGTTTGCCCTTTTCGGCGGCGCCTCGTCGCTATTAATT 

ATAGGACTTATGATTTTTGCTTGCAGCATGATGCTGACTAGCACACGAAGATGACCGCTACGCCCCAATGATCCGACCAG 

CAAAACTCGATGTACTTCCGAGGAACTGATGTGCATAATGCATCAGGCTGGTACATTAGATCCCCGCTTACCGCGGGCAA 

TATAGCAACACTAAAAACTCGATGTACTTCCGAGGAAGCGCAGTgCATAATGCTGCGCaGTGTTGCCACATAACCACTAT

ATTAACCATTTATCTAGCGGACGCCAAAAACTCAATGTATTTCTGAGGAAGCGTGGTGCATAATGCCACGCAGCGTCTGC 

ATAACTTTTATTATTTCTTTTATTAATCAACAAAATTTTGTTTTTAACATTTCAAAAAAAAAAAAAAAAAAAAAAAAAAA 

AAAAAAAAAAGGGAATTCctcGAGGGGAATTAATTCTTGAAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTA 

ATGTCATGATAATAATGGTTTCTTAGACGTCAGGTGGCACTTTTCGGGGAAATGTGCGCGGAACCCCTATTTGTTTATTT 

TTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAG 

TATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAA 

CGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAG 

ATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATC 

CCGTGTTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCA 

CAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCC 

AACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCT 

TGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAA 

CGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAA 

GTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGGTC 

TCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAA 

CTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAACTGTCAGACCAAGTTTAC 

TCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGATCCTTTTTGATAATCTCAT 

GACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATC 
CTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAG
CTACCAACTGTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTAGTT 
AGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTG 
GCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGT 
TCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCATTGAGAAAGCGCCAC 
GCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAG
GGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCA 
GGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGAGCTCGGATCTGGCTAGCGATGACCCTGCTGATTGGTTCGCTGA 
CCATTTCCGGGTGCGGGACGGCGTTACCAGAAACTCAGAAGGTTCGTCCAACCAAACCGACTCTGACGGCAGTTTACGAG 
AGAGATGATAGGGTCTGCATCAGTAAGCCAGATGCTACACAATTAGGCTTGTACATATTGTCGTTAGAACGCGGCTACAA 
TTAATACATAACCTTATGTATCATACACATACGATTTAGGTGACACTATA 
ATTGACGGCGTAGTACACACTATTGAATCAAACAGCCGACCAATTGCACTACCATCACAATGGAGAAGCCAGTAGTAAAC

GTAGACGTAGACCCCCAGAGTCCGTTTGTCGTGCAACTGCAAAAAAGCTTCCCGCAATTTGAGGTAGTAGCACAGCAGGT

CACTCCAAATGACCATGCTAATGCCAGAGCATTTTCGCATCTGGCCAGTAAACTAATCGAGCTGGAGGTTCCTACCACAG 

CGACGATCTTGGACATAGGCAGCGCACCGGCTCGTAGAATGTTTTCCGAGCACCAGTATCATTGTGTCTGCCCCATGCGT 

AGTCCAGAAGACCCGGACCGCATGATGAAATACGCCAGTAAACTGGCGGAAAAAGCGTGCAAGATTACAAACAAGAACTT 

GCATGAGAAGATTAAGGATCTCCGGACCGTACTTGATACGCCGGATGCTGAAACACCATCGCTCTGCTTTCACAACGATG 

TTACCTGCAACATGCGTGCCGAATATTCCGTCATGCAGGACGTGTATATCAACGCTCCCGGAACTATCTATCATCAGGCT 

ATGAAAGGCGTGCGGACCCTGTACTGGATTGGCTTCGACACCACCCAGTTCATGTTCTCGGCTATGGCAGGTTCGTACCC 

TGCGTACAACACCAACTGGGCCGACGAGAAAGTCCTTGAAGCGCGTAACATCGGACTTTGCAGCACAAAGCTGAGTGAAG 

GTAGGACAGGAAAATTGTCGATAATGAGGAAGAAGGAGTTGAAGCCCGGGTCGCGGGTTTATTTCTCCGTAGGATCGACA 

CTTTATCCAGAACACAGAGCCAGCTTGCAGAGCTGGCATCTTCCATCGGTGTTCCACTTGAATGGAAAGCAGTCGTACAC 

TTGCCGCTGTGATACAGTGGTGAGTTGCGAAGGCTACGTAGTGAAGAAAATCACCATCAGTCCCGGGATCACGGGAGAAA 

CCGTGGGATACGCGGTTACACACAATAGCGAGGGCTTCTTGCTATGCAAAGTTACTGACACAGTAAAAGGAGAACGGGTA 

TCGTTCCCTGTGTGCACGTACATCCCGGCCACCATATGCGATCAGATGACTGGTATAATGGCCACGGATATATCACCTGA 

CGATGCACAAAAACTTCTGGTTGGGCTCAACCAGCGAATTGTCATTAACGGTAGGACTAACAGGAACACCAACACCATGC 

AAAATTACCTTCTGCCGATCATAGCACAAGGGTTCAGCAAATGGGCTAAGGAGCGCAAGGATGATCTTGATAACGAGAAA 

ATGCTGGGTACTAGAGAACGCAAGCTTACGTATGGCTGCTTGTGGGCGTTTCGCACTAAGAAAGTACATTCGTTTTATCG 

CCCACCTGGAACGCAGACCTGCGTAAAAGTCCCAGCCTCTTTTAGCGCTTTTCCCATGTCGTCCGTATGGACGACCTCTT

TGCCCATGTCGCTGAGGCAGAAATTGAAACTGGCATTGCAACCAAAGAAGGAGGAAAAACTGCTGCAGGTCTCGGAGGAA 

TTAGTCATGGAGGCCAAGGCTGCTTTTGAGGATGCTCAGGAGGAAGCCAGAGCGGAGAAGCTCCGAGAAGCACTTCCACC 

ATTAGTGGCAGACAAAGGCATCGAGGCAGCCGCAGAAGTTGTCTGCGAAGTGGAGGGGCTCCAGGCGGACATCGGAGCAG 

CATTAGTTGAAACCCCGCGCGGTCACGTAAGGATAATACCTCAAGCAAATGACCGTATGATCGGACAGTATATCGTTGTC 

TCGCCAAACTCTGTGCTGAAGAATGCCAAACTCGCACCAGCGCACCCGCTAGCAGATCAGGTTAAGATCATAACACACTC 

CGGAAGATCAGGAAGGTACGCGGTCGAACCATACGACGCTAAAGTACTGATGCCAGCAGGAGGTGCCGTACCATGGCCAG 

AATTCCTAGCACTGAGTGAGAGCGCCACGTTAGTGTACAACGAAAGAGAGTTTGTGAACCGCAAACTATACCACATTGCC 

ATGCATGGCCCCGCCAAGAATACAGAAGAGGAGCAGTACAAGGTTACAAAGGCAGAGCTTGCAGAAACAGAGTACGTGTT 

TGACGTGGACAAGAAGCGTTGCGTTAAGAAGGAAGAAGCCTCAGGTCTGGTCCTCTCGGGAGAACTGACCAACCCTCCCT 

ATCATGAGCTAGCTCTGGAGGGACTGAAGACCCGACCTGCGGTCCCGTACAAGGTCGAAACAATAGGAGTGATAGGCACA 

CCGGGGTCGGGCAAGTCAGCTATTATCAAGTCAACTGTCACGGCACGAGATCTTGTTACCAGCGGAAAGAAAGAAAATTG 

TCGCGAAATTGAGGCCGACGTGCTAAGACTGAGGGGTATGCAGATTACGTCGAAGACAGTAGATTCGGTTATGCTCAACG 

GATGCCACAAAGCCGTAGAAGTGCTGTACGTTGACGAAGCGTTCGCGTGCCACGCAGGAGCACTACTTGCCTTGATTGCT 

ATCGTCAGGCCCCGCAAGAAGGTAGTACTATGCGGAGACCCCATGCAATGCGGATTCTTCAACATGATGCAACTAAAGGT 

ACATTTCAATCACCCTGAAAAAGACATATGCACCAAGACATTCTACAAGTATATCTCCCGGCGTTGCACACAGCCAGTTA 

CAGCTATTGTATCGACACTGCATTACGATGGAAAGATGAAAACCACGAACCCGTGCAAGAAGAACATTGAAATCGATATT 

ACAGGGGCCACAAAGCCGAAGCCAGGGGATATCATCCTGACATGTTTCCGCGGGTGGGTTAAGCAATTGCAAATCGACTA 

TCCCGGACATGAAGTAATGACAGCCGCGGCCTCACAAGGGCTAACCAGAAAAGGAGTGTATGCCGTCCGGCAAAAAGTCA 

ATGAAAACCCACTGTACGCGATCACATCAGAGCATGTGAACGTGTTGCTCACCCGCACTGAGGACAGGCTAGTGTGGAAA 

ACCTTGCAGGGCGACCCATGGATTAAGCAGCCCACTAACATACCTAAAGGAAACTTTCAGGCTACTATAGAGGACTGGGA 

AGCTGAACACAAGGGAATAATTGCTGCAATAAACAGCCCCACTCCCCGTGCCAATCCGTTCAGCTGCAAGACCAACGTTT 

GCTGGGCGAAAGCATTGGAACCGATACTAGCCACGGCCGGTATCGTACTTACCGGTTGCCAGTGGAGCGAACTGTTCCCA 

CAGTTTGCGGATGACAAACCACATTCGGCCATTTACGCCTTAGACGTAATTTGCATTAAGTTTTTCGGCATGGACTTGAC 

AAGCGGACTGTTTTCTAAACAGAGCATCCCACTAACGTACCATCCCGCCGATTCAGCGAGGCCGGTAGCTCATTGGGACA 

ACAGCCCAGGAACCCGCAAGTATGGGTACGATCACGCCATTGCCGCCGAACTCTCCCGTAGATTTCCGGTGTTCCAGCTA 

GCTGGGAAGGGCACACAACTTGATTTGCAGACGGGGAGAACCAGAGTTATCTCTGCACAGCATAACCTGGTCCCGGTGAA 

CCGCAATCTTCCTCACGCCTTAGTCCCCGAGTACAAGGAGAAGCAACCCGGCCCGGTCAAAAAATTCTTGAACCAGTTCA 
aacaccactcagtacttgtggtatcagaggaaaaaattgaagctccccgtaagagaatcgaatggatcgccccgattggc

atagccggtgcagataagaactacaacctggctttcgggtttccgccgcaggcacggtacgacctggtgttcatcaacat 

tggaactaaatacagaaaccaccactttcagcagtgcgaagaccatgcggcgaccttaaaaaccctttcgcgttcggccc 

tgaattgccttaacccaggaggcaccctcgtggtgaagtcctatggctacgccgaccgcaacagtgaggacgtagtcacc 

gctcttgccagaaagtttgtcagggtgtctgcagcgagaccagattgtgtctcaagcaatacagaaatgtacctgatttt 

ccgacaactagacaacagccgtacacggcaattcaccccgcaccatctgaattgcgtgatttcgtccgtgtatgagggta 

caagagatggagttggagccgcgccgtcataccgcaccaaaagggagaatattgctgactgtcaagaggaagcagttgtc 

aacgcagccaatccgctgggtagaccaggcgaaggagtctgccgtgccatctataaacgttggccgaccagttttaccga 

ttcagccacggagacaggcaccgcaagaatgactgtgtgcctaggaaagaaagtgatccacgcggtcggccctgatttcc 

ggaagcacccagaagcagaagccttgaaattgctacaaaacgcctaccatgcagtggcagacttagtaaatgaacataac 

atcaagtctgtcgccattccactgctatctacaggcatttacgcagccggaaaagaccgccttgaagtatcacttaactg 

cttgacaaccgcgctagacagaactgacgcggacgtaaccatctattgcctggataagaagtggaaggaaagaatcgacg 

cggcactccaacttaaggagtctgtaacagagctgaaggatgaagatatggagatcgacgatgagttagtatggattcat 

ccagacagttgcttgaagggaagaaagggattcagtactacaaaaggaaaattgtattcgtacttcgaaggcaccaaatt 

ccatcaagcagcaaaagacatggcggagataaaggtcctgttccctaatgaccaggaaagtaatgaacaactgtgtgcct 

acatattgggtgagaccatggaagcaatccgcgaaaagtgcccggtcgaccataacccgtcgtctagcccgcccaaaacg 

ttgccgtgcctttgcatgtatgccatgacgccagaaagggtccacagacttagaagcaataacgtcaaagaagttacagt 

atgctcctccaccccccttcctaagcacaaaattaagaatgttcagaaggttcagtgcacgaaagtagtcctgtttaatc 

cgcacactcccgcattcgttcccgcccgtaagtacatagaagtgccagaacagcctaccgctcctcctgcacaggccgag

gaggcccccgaagttgtagcgacaccgtcaccatctacagctgataacacctcgcttgatgtcacagacatctcactgga 

tatggatgacagtagcgaaggctcacttttttcgagctttagcggatcggacaactctattactagtatggacagttggt 

cgtcaggacctagttcactagagatagtagaccgaaggcaggtggtggtggctgacgttcatgccgtccaagagcctgcc 

cctattccaccgccaaggctaaagaagatggcccgcctggcagcggcaagaaaagagcccactccaccggcaagcaatag 

ctctgagtccctccacctctcttttggtggggtatccatgtccctcggatcaattttcgacggagagacggcccgccagg 

cagcggtacaacccctggcaacaggccccacggatgtgcctatgtctttcggatcgttttccgacggagagattgatgag 

ctgagccgcagagtaactgagtccgaacccgtcctgtttggatcatttgaaccgggcgaagtgaactcaattatatcgtc 

ccgatcagccgtatcttttccactacgcaagcagagacgtagacgcaggagcaggaggactgaatactgactaaccgggg 

taggtgggtacatattttcgacggacacaggccctgggcacttgcaaaagaagtccgttctgcagaaccagcttacagaa 

ccgaccttggagcgcaatgtcctggaaagaattcatgccccggtgctcgacacgtcgaaagaggaacaactcaaactcag 

gtaccagatgatgcccaccgaagccaacaaaagtaggtaccagtctcgtaaagtagaaaatcagaaagccataaccactg 

agcgactactgtcaggactacgactgtataactctgccacagatcagccagaatgctataagatcacctatccgaaacca 

ttgtactccagtagcgtaccggcgaactactccgatccacagttcgctgtagctgtctgtaacaactatctgcatgagaa 

ctatccgacagtagcatcttatcagattactgacgagtacgatgcttacttggatatggtagacgggacagtcgcctgcc 

tggatactgcaaccttctgccccgctaagcttagaagttacccgaaaaaacatgagtatagagccccgaatatccgcagt 

gcggttccatcagcgatgcagaacacgctacaaaatgtgctcattgccgcaactaaaagaaattgcaacgtcacgcagat 

gcgtgaactgccaacactggactcagcgacattcaatgtcgaatgctttcgaaaatatgcatgtaatgacgagtattggg 

aggagttcgctcggaagccaattaggattaccactgagtttgtcaccgcatatgtagctagactgaaaggccctaaggcc 

gccgcactatttgcaaagacgtataatttggtcccattgcaagaagtgcctatggatagattcgtcatggacatgaaaag 

agacgtgaaagttacaccaggcacgaaacacacagaagaaagaccgaaagtacaagtgatacaagccgcagaacccctgg 

cgactgcttacttatgcgggattcaccgggaattagtgcgtaggcttacggccgtcttgcttccaaacattcacacgctt 

tttgacatgtcggcggaggattttgatgcaatcatagcagaacacttcaagcaaggcgacccggtactggagacggatat 

cgcatcattcgacaaaagccaagacgacgctatggcgttaaccggtctgatgatcttggaggacctgggtgtggatcaac 

cactactcgacttgatcgagtgcgcctttggagaaatatcatccacccatctacctacgggtactcgttttaaattcggg 

gcgatgatgaaatccggaatgttcctcacactttttgtcaacacagttttgaatgtcgttatcgccagcagagtactaga 
AGAGCGGCTTAAAACGTCCAGATGTGCAGCGTTCATTGGCGACGACAACATCATACATGGAGTAGTATCTGACAAAGAAA

TGGCTGAGAGGTGCGCCACCTGGCTCAACATGGAGGTTAAGATCATCGACGCAGTCATCGGTGAGAGACCACCTTACTTC 

TGCGGCGGATTTATCTTGCAAGATTCGGTTACTTCCACAGCGTGCCGCGTGGCGGATCCCCTGAAAAGGCTGTTTAAGTT 

GGGTAAACCGCTCCCAGCCGACGACGAGCAAGACGAAGACAGAAGACGCGCTCTGCTAGATGAAACAAAGGCGTGGTTTA 

GAGTAGGTATAACAGGCACTTTAGCAGTGGCCGTGACGACCCGGTATGAGGTAGACAATATTACACCTGTCCTACTGGCA 

TTGAGAACTTTTGCCCAGAGCAAAAGAGCATTCCAAGCCATCAGAGGGGAAATAAAGCATCTCTACGGTGGTCCTAAATA 

GTCAGCATAGTACATTTCATCTGACTAATACTACAACACCACCACCtctagaccatggggtaccgagctcgaattcgCCT

CGTCGCTATTAATTATAGGACTTATGATTTTTGCTTGCAGCATGATGCTGACTAGCACACGAAGATGACgggcccAGGTA 

GACAATATTACACCTGTCCTACTGGCATTGAGAACTTTTGCCGAGAGCAAAAGAGCATTCCAAGCCATCAGAGGGGAAAT 

AAAGCATCTCTACGGTGGTCCTAAATAGTCAGCATAGTACATTTCATCTGACTAATACTACAACACCACCACCATGAATA 

GAGGATTCTTTAACATGCTCGGCCGCCGCCCCTTCCCGGCCCCCACTGCCATGTGGAGGCCGCGGAGAAGGAGGCAGGCG 

GCCCCGATGCCTGCCCGCAACGGGCTGGCTTCTCAAATCCAGCAACTGACCACAGCCGTCAGTGCCCTAGTCATTGGACA 

GGCAACTAGACCTCAACCCCCACGTCCACGCCCGCCACCGCGCCAGAAGAAGCAGGCGCCCAAGCAACCACCGAAGCCGA 

AGAAACCAAAAACGCAGGAGAAGAAGAAGAAGCAACCTGCAAAACCCAAACCCGGAAAGAGACAGCGCATGGCACTTAAG 

TTGGAGGCCGACAGATTGTTCGACGTCAAGAACGAGGACGGAGATGTCATCGGGCACGCACTGGCCATGGAAGGAAAGGT

AATGAAACCTCTGCACGTGAAAGGAACCATCGACCACCCTGTGCTATCAAAGCTCAAATTTACCAAGTCGTCAGCATACG 

ACATGGAGTTCGCACAGTTGCCAGTCAACATGAGAAGTGAGGCATTCACCTACACCAGTGAACACCCCGAAGGATTCTAT 

AACTGGCACCACGGAGCGGTGCAGTATAGTGGAGGTAGATTTACCATCCCTCGCGGAGTAGGAGGCAGAGGAGACAGCGG 

TCGTCCGATCATGGATAACTCCGGTCGGGTTGTCGCGATAGTCCTCGGTGGCGCTGATGAAGGAACACGAACTGCCCTTT 

CGGTCGTCACCTGGAATAGTAAAGGGAAGACAATTAAGACGACCCCGGAAGGGACAGAAGAGTGGTCCGCAGCACCACTG 

GTCACGGCAATGTGTTTGCTCGGAAATGTGAGCTTCCCATGCGACCGCCCGCCCACATGCTATACCCGCGAACCTTCCAG 

AGCCCTCGACATCCTTGAAGAGAACGTGAACCATGAGGCCTACGATACCCTGCTCAATGCCATATTGCGGTGCGGATCGT 

CTGGCAGAAGCAAAAGAAGCGTCATcGACGACTTTACCCTGACCAGCCCCTACTTGGGCACATGCTCGTACTGCCACCAT 

ACTGaACCGTGCTTCAGCCCTGTTAAGATCGAGCAGGTCTGGGACGAAGCGGACGATAACACCATACGCATACAGACTTC 

CGCCCAGTTTGGATACGACCAtAGCGGAGCAGCAAGCGCAAACAAGTACCGCTACATGTCGCTTAAGCAGGATCACACCG 

TTAAAGAAGGCACCATGGATGACATCAAGATTAGCACCTCAGGACCGTGTAGAAGGCTTAGCTACAAAGGATACTTTCTC 

CTCGCAAAATGCCCTCCAGGGGACAGCGTAACGGTTAGCATAGTGAGTAGCAACTCAGCAACGTCATGTACACTGGCCCG 

CAAGATAAAACCAAAATTCGTGGGACGGGAAAAATATGATCTACCTCCCGTTCACGGTAAAAAAATTCCTTGCACAGTGT 

ACGACCGTCTGAAAGAAACAACTGCAGGCTACATCACTATGCACAGGCCGgGACCGCACGCTTATACATCCTACCTGGAA 

GAATCATCAGGGAAAGTTTACGCAAAGCCGCCATCTGGGAAGAACATTACGTATGAGTGCAAGTGCGGCGACTACAAGAC

CGGAACCGTTTCGACCCGCACCGAAATCACTGGTTGCACCGCCATCAAGCAGTGCGTCGCCTATAAGAGCGACCAAACGA 

AGTGGGTCTTCAACTCACCGGACTTGATCAGACATGACGACCACACGGCCCAAGGGAAATTGCATTTGCCTTTCAAGTTG 

ATCCCGAGTACCTGCATGGTCCCTGTTGCCCACGCGCCGAATGTAATACATGGCTTTAAACACATCAGCCTCCAATTAGA 

TACAGACCACTTGACATTGCTCACCACCAGGAGACTAGGGGCAAACCCGGAACCAACCACTGAATGGATCGTCGGAAAGA 

CGGTCAGAAACTTCACCGTCGACCGAGATGGCCTGGAATACATATGGGGAAATCATGAGCCAGTGAGGGTCTATGCCCAA 

GAGTCAGCACCAGGAGACCCTCACGGATGGCCACACGAAATAGTACAGCATTACTACCATCGCCATCCTGTGTACACCAT 

CTTAGCCGTCGCATCAGCTACCGTGGCGATGATGATTGGCGTAACTGTTGCAGTGTTATGTGCCTGTAAAGCGCGCCGTG 

AGTGCCTGACGCCATACGCCCTGGCCCCAAACGCCGTAATCCCAACTTCGCTGGCACTCTTGTGCTGCGTTAGGTCGGCC

AATGCTGAAACGTTCACCGAGACCATGAGTTACTTGTGGTCGAACAGTCAGCCGTTCTTCTGGGTCCAGTTGTGCATACC 

TTTGGCCGCTTTCATCGTTCTAATGCGCTGCTGCTCCTGCTGCCTGCCTTTTTTAGTGGTTGCCGGCGCCTACCTGGCGA 

AGGTAGACGCCTACGAACATGCGACCACTGTTCCAAATGTGCCACAGATACCGTATAAGGCACTTGTTGAAAGGGCAGGG 

TATGCCCCGCTCAATTTGGAGATCACTGTCATGTCCTCGGAGGTTTTGCCTTCCACCAACCAAGAGTACATTACCTGCAA 

ATTCACCACTGTGGTCCCCTCCCCAAAAATCAAATGCTGCGGCTCCTTGGAATGTCAGCCGGCCGCTCATGCAGACTATA 

CCTGCAAGGTCTTCGGAGGGGTCTACCCCTTTATGTGGGGAGGAGCGCAATGTTTTTGCGACAGTGAGAACAGCCAGATG 
AGTGAGGCGTACGTCGAATTGTCAGCAGATTGCGCGTCTGACCACGCGCAGGCGATTAAGGTGCACACTGCCGCGATGAA

AGTAGGACTGCGTATTGTGTACGGGAACACTACCAGTTTCCTAGATGTGTACGTGAACGGAGTCACACCAGGAACGTCTA 

AAGACTTGAAAGTCATAGCTGGACCAATTTCAGCATCaTTTACGCCATTCGATCATAAGGTCGTTATCCATCGCGGCCTG

GTGTACAACTATGACTTCCCGGAATATGGAGCGATGAAACCAGGAGCGTTTGGAGACATTCAAGCTACCTCCTTGACTAG 

CAAGGATCTCATCGCCAGCACAGACATTAGGCTACTCAAGCCTTCCGCCAAGAAtGTGCATGTCCCGTACACGCAGGCCg 

CATCAGGATTTGAGATGTGGAAAAACAACTCAGGCCGCCCAtTGCAGGAAACCGCACCTTTCGGGTGTAAGATTGCAGTA 

AATCCGCTCCGAGCGGTGGACTGTTCATACGGGAACATTCCCATTTCTATTGACATCCCGAACGCTGCCTTTATCAGGAC 

ATCAGATGCACCACTGGTCTCAACAGTCAAATGTGAAGTCAGTGAGTGCACTTATTCAGCAGACTTCGaCGGGATGGCCA 

CCCTGCAGTATGTATCCGACCGCGAAGGTCAATGCCCCGTACATTCGCATTCGAGCACAGCAACTCTCCAAGAGTCGACA 

GTACATGTCCTGGAGAAAGGAGCGGTGACAGTACACTTTAGCACCGCGAGTCCACAGGCGAACTTTATCGTATCGCTGTG 

TGGGAAGAAGACAACATGCAATGCAGAATGTAAACCACCAGCTGACCATATCGTGAGCACCCCGCACAAAAATGACCAAG 

AATTTCAAGCCGCCATCTCAAAAACATCATGGAGTTGGCTGTTTGCCCTTTTCGGCGGCGCCTCGTCGCTATTAATTATA 

GGACTTATGATTTTTGCTTGCAGCATGATGCTGACTAGCACACGAAGATGACCGCTACGCCCCAATGATCCGACCAGCAA 

AACTCGATGTACTTCCGAGGAACTGATGTGCATAATGCATCAGGCTGGTACATTAGATCCCCGCTTACCGCGGGCAATAT 

AGCAACACTAAAAACTCGATGTACTTCCGAGGAAGCGCAGTgCATAATGCTGCGCaGTGTTGCCACATAACCACTATATT

AACCATTTATCTAGCGGACGCCAAAAACTCAATGTATTTCTGAGGAAGCGTGGTGCATAATGCCACGCAGCGTCTGCATA 

ACTTTTATTATTTCTTTTATTAATCAACAAAATTTTGTTTTTAACATTTCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

AAAAAAAGGGAATTCctcGAGGGGAATTAATTCTTGAAGACGAAAGGGCCAGGTGGCACTTTTCGGGGAAATGTGCGCGG 

AACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAAT 

AATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCT 

GTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACT 

GGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGC 

TATGTGGCGCGGTATTATCCCGTGTTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTG 

GTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCAT 

GAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGG 

GGGATCATGTAACTCGCCTTGATCGTTGGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATG 

CCTGtAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGA 

CTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTG 

GAGCCGGTGAGCGTGGGTCTCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTAC 

ACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTA 

ACTGTCAGACCAAGTTTACTCATATATACTTTAGATTGATTTAAAACTTCATTTTTAATTTAAAAGGATCTAGGTGAAGA 

TCCTTTTTGATAATCTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATC 

AAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGT 

TTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCC 

TTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTA 

CCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCG 

GTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTG

AGCATTGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAG 

CGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCG 

ATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGAGCTCgtatggacatattgtcgtta

gaacgcggctacaattaatacataaccttatgtatcatacacatacgatttaggggacactatag
CTGACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCC

CTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGG 

GCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCACGTAGTG 

GGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACT 

GGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAA 

TGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGCTTACAATTTCCATTCGCCATTCAGGCTGCG 

CAACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGAT 

TAAGTTGGGTAACGCCAGGGTTTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGAgcgcgcaattaaccctcactaa

agggaacaaaagctggctagtgGATCCAGTCTTATGCAATACTCTTGTAGTCTTGCAACATGGTAACGATGAGTTAGCAA

CATGCCTTACAAGGAGAGAAAAAGCACCGTGCATGCCGATTGGTGGAAGTAAGGTGGTACGATCGTGCCTTATTAGGAAG 

GCAACAGACGGGTCTGACATGGATTGGACGAACCACTGAATTCCGCATTGCAGAGATATTGTATTTAAGTGCCCTACCTc 

gataccgTCGAGATATAGTGGTGAGTATCCCCGCCTGTCACGCGGGAGACCGGGGTTCGGTTCCCCGACGGGGAGCCAAA

CAGCCGACCAATTGCACTACCATCACAATGGAGAAGCCAGTAGTAAACGTAGACGTAGACCCCCAGAGTCCGTTTGTCGT 

GCAACTGCAAAAAAGCTTCCCGCAATTTGAGGTAGTAGCACAGCAGGTCACTCCAAATGACCATGCTAATGCCAGAGCAT 

TTTCGCATCTGGCCAGTAAACTAATCGAGCTGGAGGTTCCTACCACAGCGACGATCTTGGACATAGGCAGCGCACCGGCT 

CGTAGAATGTTTTCCGAGCACCAGTATCATTGTGTCTGCCCCATGCGTAGTCCAGAAGACCCGGACCGCATGATGAAATA 

CGCCAGTAAACTGGCGGAAAAAGCGTGCAAGATTACAAACAAGAACTTGCATGAGAAGATTAAGGATCTCCGGGATCCCC 

TGAAAAGGCTGTTTAAGTTGGGTAAACCGCTCCCAGCCGACGACGAGCAAGACGAAGACAGAAGACGCGCTCTGCTAGAT 

GAAACAAAGGCGTGGTTTAGAGTAGGTATAACAGGCACTTTAGCAGTGGCCGTGACGACCCGGTATGAGGTAGACAATAT 

TACACCTGTCCTACTGGCATTGAGAACTTTTGCCCAGAGCAAAAGAGCATTCCAAGCCATCAGAGGGGAAATAAAGCATC 

TCTACGGTGGTCCTAAATAGTCAGCATAGTACATTTCATCTGACTAATACTACAACACCACCACCATGAATAGAGGATTC 

TTTAACATGCTCGGCCGCCGCCCCTTCCCGGCCCCCACTGCCATGTGGAGGCCGCGGAGAAGGAGGCAGGCGGCCCCGAT 

GCCTGCCCGCAACGGGCTGGCTTCTCAAATCCAGCAACTGACCACAGCCGTCAGTGCCCTAGTCATTGGACAGGCAACTA 

GACCTCAACCCCCACGTCCACGCCCGCCACCGCGCCAGAAGAAGCAGGCGCCCAAGCAACCACCGAAGCCGAAGAAACCA 

AAAACGCAGGAGAAGAAGAAGAAGCAACCTGCAAAACCCAAACCCGGAAAGAGACAGCGCATGGCACTTAAGTTGGAGGC 

CGACAGATTGTTCGACGTCAAGAACGAGGACGGAGATGTCATCGGGCACGCACTGGCCATGGAAGGAAAGGTAATGAAAC 

CTCTGCACGTGAAAGGAACCATCGACCACCCTGTGCTATCAAAGCTCAAATTTACCAAGTCGTCAGCATACGACATGGAG 

TTCGCACAGTTGCCAGTCAACATGAGAAGTGAGGCATTCACCTACACCAGTGAACACCCCGAAGGATTCTATAACTGGCA 

CCACGGAGCGGTGCAGTATAGTGGAGGTAGATTTACCATCCCTCGCGGAGTAGGAGGCAGAGGAGACAGCGGTCGTCCGA 

TCATGGATAACTCCGGTCGGGTTGTCGCGATAGTCCTCGGTGGCGCTGATGAAGGAACACGAACTGCCCTTTCGGTCGTC

ACCTGGAATAGTAAAGGGAAGACAATTAAGACGACCCCGGAAGGGACAGAAGAGTGGTCCGCAGCACCACTGGTCACGGC 

AATGTGTTTGCTCGGAAATGTGAGCTTCCCATGCGACCGCCCGCCCACATGCTATACCCGCGAACCTTCCAGAGCCCTCG 

ACATCCTTGAAGAGAACGTGAACCATGAGGCCTACGATACCCTGCTCAATGCCATATTGCGGTGCGGATCGTCTGGCAGA 

AGCAAAAGAAGCGTCATTGACGACTTTACCCTGACCAGCCCCTACTTGGGCACATGCTCGTACTGCCACCATACTGTACC 

GTGCTTCAGCCCTGTTAAGATCGAGCAGGTCTGGGACGAAGCGGACGATAACACCATACGCATACAGACTTCCGCCCAGT 

TTGGATACGACCAAAGCGGAGCAGCAAGCGCAAACAAGTACCGCTACATGTCGCTTAAGCAGGATCACACCGTTAAAGAA 

GGCACCATGGATGACATCAAGATTAGCACCTCAGGACCGTGTAGAAGGCTTAGCTACAAAGGATACTTTCTCCTCGCAAA 

ATGCCCTCCAGGGGACAGCGTAACGGTTAGCATAGTGAGTAGCAACTCAGCAACGTCATGTACACTGGCCCGCAAGATAA 

AACCAAAATTCGTGGGACGGGAAAAATATGATCTACCTCCCGTTCACGGTAAAAAAATTCCTTGCACAGTGTACGACCGT 

CTGAAAGAAACAACTGCAGGCTACATCACTATGCACAGGCCGAGACCGCACGCTTATACATCCTACCTGGAAGAATCATC 

AGGGAAAGTTTACGCAAAGCCGCCATCTGGGAAGAACATTACGTATGAGTGCAAGTGCGGCGACTACAAGACCGGAACCG 

TTTCGACCCGCACCGAAATCACTGGTTGCACCGCCATCAAGCAGTGCGTCGCCTATAAGAGCGACCAAACGAAGTGGGTC 

TTCAACTCACCGGACTTGATCAGACATGACGACCACACGGCCCAAGGGAAATTGCATTTGCCTTTCAAGTTGATCCCGAG 

TACCTGCATGGTCCCTGTTGCCCACGCGCCGAATGTAATACATGGCTTTAAACACATCAGCCTCCAATTAGATACAGACC
ACTTGACATTGCTCACCACCAGGAGACTAGGGGCAAACCCGGAACCAACCACTGAATGGATCGTCGGAAAGACGGTCAGA

AACTTCACCGTCGACCGAGATGGCCTGGAATACATATGGGGAAATCATGAGCCAGTGAGGGTCTATGCCCAAGAGTCAGC 

ACCAGGAGACCCTCACGGATGGCCACACGAAATAGTACAGCATTACTACCATCGCCATCCTGTGTACACCATCTTAGCCG 

TCGCATCAGCTACCGTGGCGATGATGATTGGCGTAACTGTTGCAGTGTTATGTGCCTGTAAAGCGCGCCGTGAGTGCCTG 

ACGCCATACGCCCTGGCCCCAAACGCCGTAATCCCAACTTCGCTGGCACTCTTGTGCTGCGTTAGGTCGGCCAATGCTGA 

AACGTTCACCGAGACCATGAGTTACTTGTGGTCGAACAGTCAGCCGTTCTTCTGGGTCCAGTTGTGCATACCTTTGGCCG 

CTTTCATCGTTCTAATGCGCTGCTGCTCCTGCTGCCTGCCTTTTTTAGTGGTTGCCGGCGCCTACCTGGCGAAGGTAGAC 

GCCTACGAACATGCGACCACTGTTCCAAATGTGCCACAGATACCGTATAAGGCACTTGTTGAAAGGGCAGGGTATGCCCC 

GCTCAATTTGGAGATCACTGTCATGTCCTCGGAGGTTTTGCCTTCCACCAACCAAGAGTACATTACCTGCAAATTCACCA 

CTGTGGTCCCCTCCCCAAAAATCAAATGCTGCGGCTCCTTGGAATGTCAGCCGGCCGCTCATGCAGACTATACCTGCAAG 

GTCTTCGGAGGGGTCTACCCCTTTATGTGGGGAGGAGCGCAATGTTTTTGCGACAGTGAGAACAGCCAGATGAGTGAGGC 

GTACGTCGAATTGTCAGCAGATTGCGCGTCTGACCACGCGCAGGCGATTAAGGTGCACACTGCCGCGATGAAAGTAGGAC 

TGCGTATTGTGTACGGGAACACTACCAGTTTCCTAGATGTGTACGTGAACGGAGTCACACCAGGAACGTCTAAAGACTTG 

AAAGTCATAGCTGGACCAATTTCAGCATCGTTTACGCCATTCGATCATAAGGTCGTTATCCATCGCGGCCTGGTGTACAA 

CTATGACTTCCCGGAATATGGAGCGATGAAACCAGGAGCGTTTGGAGACATTCAAGCTACCTCCTTGACTAGCAAGGATC 

TCATCGCCAGCACAGACATTAGGCTACTCAAGCCTTCCGCCAAGAACGTGCATGTCCCGTACACGCAGGCCTCATCAGGA 

TTTGAGATGTGGAAAAACAACTCAGGCCGCCCACTGCAGGAAACCGCACCTTTCGGGTGTAAGATTGCAGTAAATCCGCT 

CCGAGCGGTGGACTGTTCATACGGGAACATTCCCATTTCTATTGACATCCCGAACGCTGCCTTTATCAGGACATCAGATG 

CACCACTGGTCTCAACAGTCAAATGTGAAGTCAGTGAGTGCACTTATTCAGCAGACTTCGGCGGGATGGCCACCCTGCAG 

TATGTATCCGACCGCGAAGGTCAATGCCCCGTACATTCGCATTCGAGCACAGCAACTCTCCAAGAGTCGACAGTACATGT 

CCTGGAGAAAGGAGCGGTGACAGTACACTTTAGCACCGCGAGTCCACAGGCGAACTTTATCGTATCGCTGTGTGGGAAGA 

AGACAACATGCAATGCAGAATGTAAACCACCAGCTGACCATATCGTGAGCACCCCGCACAAAAATGACCAAGAATTTCAA 

GCCGCCATCTCAAAAACATCATGGAGTTGGCTGTTTGCCCTTTTCGGCGGCGCCTCGTCGCTATTAATTATAGGACTTAT 

GATTTTTGCTTGCAGCATGATGCTGACTAGCACACGAAGATGACCGCTACGCCCCAATGATCCGACCAGCAAAACTCGAT 

GTACTTCCGAGGAACTGATGTGCATAActogogGoATTCCGCCCCTCTCCCTCCCCCCCCCCTAACGTTACTGGCCGAAG 

CCGCTTGGAATAAGGCCGGTGTGCGTTTGTCTATATGTTATTTTCCACCATATTGCCGTCTTTTGGCAATGTGAGGGCCC 

GGAAACCTGGCCCTGTCTTCTTGACGAGCATTCCTAGGGGTCTTTCCCCTCTCGCCAAAGGAATGCAAGGTCTGTTGAAT 

GTCGTGAAGGAAGCAGTTCCTCTGGAAGCTTCTTGAAGACAAACAACGTCTGTAGCGACCCTTTGCAGGCAGCGGAACCC 

CCCACCTGGCGACAGGTGCCTCTGCGGCCAAAAGCCACGTGTATAAGATACACCTGCAAAGGCGGCACAACCCCAGTGCC 

ACGTTGTGAGTTGGATAGTTGTGGAAAGAGTCAAATGGCTCTCCTCAAGCGTATTCAACAAGGGGCTGAAGGATGCCCAG

AAGGTACCCCATTGTATGGGATCTGATCTGGGGCCTCGGTGCACATGCTTTACATGTGTTTAGTCGAGGTTAAAAAACGT 

CTAGGCCCCCCGAACCACGGGGACGTGGTTTTCCTTTGAAAAACACGATGATAAGCTTGCCACAACCATGgctgaacAAG

ATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGC 

TCTGATGCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAA 

TGAACTGCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCA 

CTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAG 

AAAGTATCCATCATGGCTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAGCGAA 

ACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAAGAGCATCAGGGGC 

TCGCGCCAGCCGAACTGTTCGCCAGGCTCAAGGCGCGCATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCC 

TGCTTGCCGAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGGACCGCTA 

TCAGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACG 

GTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTTCTTGACGAGttcttctgagcgggatcggctagTCAG

GCTGGTACATTAGATCCCCGCTTACCGCGGGCAATATAGCAACACTAAAAACTCGATGTACTTCCGAGGAAGCGCAGTgC 
ATAATGCTGCGCaGTGTTGCCACATAACCACTATATTAACCATTTATCTAGCGGACGCCAAAAACTCAATGTATTTCTGA

GGAAGCGTGGTGCATAATGCCACGCAGCGTCTGCATAACTTTTATTATTTCTTTTATTAATCAACAAAATTTTGTTTTTA 

ACATTTCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGGGAATTCCCAACTTGTTTATTGCAGCTTATAATGG 

TTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAAC 

TCATCAATGTATCTTATCATGTCTGGATCCGTCGAGACGCGTccaattcgccctatagtgagtcgtattacgcgcgcTTG

GCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACAGAACATACGAGCCGGAAGCAT 

AAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGG 

GAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCT 

TCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTT 

ATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCG 

CGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAAC 

CCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTAC 

CGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCAATGCTCACGCTGTAGGTATCTCAGTTCGGTGT 

AGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGT 

CTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGT 

AGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGC 

TGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTT 

GTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCA 

GTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAA 

AATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCT 

ATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTT 

ACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAG 

CCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGA 

GTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGG 

TATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCT 

CCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCT 

CTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCG 

GCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTG 

GAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCC 

AACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGG 

AATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTC 

TCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCA 

C 

CTAGAGATCTGCAGGTCGACGGATCCCCCGAGATTTTCAGGAGCTAAGGAAGCTAAAATGGAGAAAAAAATC 
ACTGGATATACCACCGTTGATATATCCCAATGGCATCGTAAAGAACATTTTGAGGCATTTCAGTCAGTTGCT 
CAATGTAACTATAACCAGACCGTTCAGCTGGATATTACGGCCTTTTTAAAGACCGTAAAGAAAAATAAGCAC 
AAGTTTTATCCGGCCTTTATTCACATTCTTGCCCGCCTGATGAATGCTCATCCGGAATTCCGTATGGCAATG 
AAAGACGGTGAGCTGGTGATATGGGATAGTGTTCACCCTTGTTACACCGTTTTCCATGAGCAAACTGAAACG 
TTTTCATCGCTCTGGAGTGAATACCACGACGATTTCCGGCAGTTTCTAGACATATATTCGCAAGATGTGGCG 
TGTTACGGTGAAAACCTGGCCTATTTCCCTAAAGGGTTTATTGAGAATATGTTTTTCGTCTCAGCCAATCCC 
TGGGTGAGTTTCACCAGTTTTGATTTAAACGTGGCCAATATGGACAACTTCTTCGCCCCCGTTTTCACCATG 
GGCAAATATTATACGCAAGGCGACAAGGTGCTGATGCCGCTGGCGATTCAGGTTCATCATGCCGTCTGTGAT 
GGCTTCCATGTCGGCAGAATGCTTAATGAATTACAACAGTACTGCGATGAGTGGCAGGGCGGGGCGTAATTT 
TTTTAAGGCAGTTATTGGTGCCTCGA 

FIG. 14 



Nucleotide sequence of the XbaI-ApaI fragment of synthetic erythropoietin 

tctagaggcgcggagatgggggtgcacgaatgtcctgcctggctgtggcttctcctgtccctgctgtcgctc 
cctctgggcctcccagtcctgggcgccccaccacgcctcatctgtgacagccgagtcctggagaggtacctc 
ttggaggccaaggaggccgagaatatcacgacgggctgtgctgaacactgcagcttgaatgagaatatcact 
gtcccagacaccaaagttaatttctatgcctggaagaggatggaggtcgggcagcaggccgtagaagtctgg 
cagggcctggccctgctgtcggaagctgtcctgcggggccaggccctgttggtcaactcttcccagccgtgg 
gagcccctgcagctgcatgtggataaagccgtcagtggccttcgcagcctcaccactctgcttcgggctctg 
ggagcccagaaggaagccatctcccctccagatgcggcctcagctgctccactccgaacaatcactgctgac 
actttccgcaaactcttccgagtctactccaatttcctccggggaaagctgaagctgtacacaggggaggcc 
tgcaggacaggggacagatgagcatgcaggccttgggCCC 

FIG. 15 